Basic Operations on Images¶

목표¶

pixel의 intensity 또는 color vector의 값을 읽거나 수정하기.
image의 property들을 확인하기
ROI 설정하기.
image를 여러 축으로 나누거나 합치기.

OpenCV의 Python Binding에서

기본 데이터 구조로 NumPy의 ndarray를 채택하고 있기 때문에
NumPy의 ndarray를 다루는 방법으로 image를 처리할 수 있다.

Pixel 개개의 값에 접근하고 수정하기.¶

우선 가장 간단한 gray-scale image에서 접근 및 수정은 결국 pixel intensity가 scalar이므로 단일값(scalar)에 대한 읽기와 수정이 된다.

import cv2
import numpy as np
import requests

def get_img(url,is_gray=True):
    image_ndarray = np.astray(bytearray(requests.get(url).content), dtype=np.uint8)
    mode = cv2.IMREAD_UNCHANGED
    if is_gray:
        mode = cv2.IMREAD_GRAYSLAKE
    img = cv2.decode(image_ndarray, mode)
    return img

url = 'https://raw.githubusercontent.com/dsaint31x/OpenCV_Python_Tutorial/master/images/cat_cv.tif'
src = get_img(url)

if src is None:
    print('Error: invalid url!')
else:
    img = src.copy()

intensity = img[100,100]
print(f'img[100,100]={intensity}')

결과는 다음과 같음:

img[100,100]=122

Color image의 경우, 3~4개의 elements를 가진 vector가 한 pixel에 할당되어 있다.
각 element를 channel 혹은 depth라고 부른다: BGR(A)

import cv2
import numpy as np
import matplotlib.pyplot as plt

url = 'https://raw.githubusercontent.com/dsaint31x/OpenCV_Python_Tutorial/master/images/messi5.jpg'

src = get_img(url,False)

if src is None:
    print('Error : Loading image')
else:
    print('OK : Loading image')
    img = src.copy()

intensity = img[100,100]
print(f'img[100,100]={intensity}')

결과는 다음과 같음:

OK : Loading image
img[100,100]=[157 166 200]

각 pixel의 값 수정은 다음과 같음.

img[100,100] = [255,255,255]
print(img[100,100])

Warning

NumPy는 array(배열) 구조에 최적화된 라이브러리이기 때문에,

for문 등을 통해 pixel 하나씩 접근하는 방식은 권장되지 않음.

특정 region을 보는 방법은 slicing을 이용한다.

tmp = img[200:250,-300:-100]
print(tmp.shape)
plt.imshow(tmp[...,::-1])
plt.axis('off')
plt.show()

결과는 다음과 같음:

`ndarray.item()` and `ndarray.itemset()`¶

item()과 itemset()을 이용한 방식은
앞서 살펴본 개별 pixel에 접근하는 것 보다는 나은 방법이긴 하지만,
역시 개별 접근에 해당함.

가급적 하나의 pixel에 접근하는 방법보다
slicing을 통한 접근 또는 matrix 연산 또는 broadcasting등을 활용한 처리가 효과적이다.

Note¶

For individual pixel access, Numpy array methods, array.item() and array.itemset() is considered to be better.

But it always returns a scalar.

So if you want to access all B,G,R values, you need to call array.item() separately for all.

읽기

# accessing RED value
img.item(10,10,2)

쓰기

# modifying RED value
img.itemset((10,10,2),100)
img.item(10,10,2)

Accessing Image Properties¶

Image properties include

image의 dimension information.
- height, width, depth 정보들
image의 기본 데이터 type
- OpenCV는 주로 unit8을 사용함.
전체 데이터 element의 수
- color image의 경우 1개의 pixel이 3개의 element로 구성됨.
etc.

img.shape: image의 dimension info.에 해당하는 tuple.
NumPy의 ndarray이므로 (rows,columns,channels) 로 구성된다.
즉, 높이가 먼저 놓임.

>>> print(img.shape)
(342, 548, 3)

img.dtype: image에 해당하는 NumPy의 ndarray의 한 요소의 데이터 타입.
OpenCV의 경우 주로 uint8을 사용함.

>>> print(img.dtype)
uint8

img.size: image에 해당하는 NumPy의 ndarray의 요소(element)들의 총 갯수.

>>> print(img.size)

>>> size = 1
>>> for i in img.shape:
>>>     size=size *i
>>> size = 0 if size == 1 else size
>>> print(size)

562248
562248

OpenCV에서 지원하는 다양한 function들이 요구하는 dtype를 맞게 사용해야 한다.
주로 uint8을 사용하긴 하지만, 반드시 API로 확인할 것.

Image ROI¶

ROI는 Region of Interest 의 약어임.
ROI 지정은 NumPy의 slicing과 indexing을 활용하여 이루어진다.

다음 예제는 축구공을 ROI로 삼아서 이를 다른 영역에 복사해넣음으로서 원본과 달리 축구공이 2개 존재하게 바꾼다. 간단하게 ROI를 어떻게 지정하고 값의 변경을 어떻게 할 수 있는지를 보여준다.

import cv2
import sys
import os

d_str = os.path.dirname(__file__)
fstr = os.path.join(d_str,'img/messi5.jpg')

img_ori = cv2.imread(fstr)
if img_ori is None:
    sys.exit(f'There is not a file:{fstr}')
cv2.imshow('original image',img_ori) #expects true color
img = img_ori.copy()

ball = img[280:340,330:390] # numpy is based on matrix. Note that x is width and column, y is height and row.
print(ball.shape)
img[273:333,100:160] = ball

cv2.imshow('modified image',img)

# # ------------------------
# # for simplicity! In this case, never use x button on window to close it.
# cv2.waitKey(0)
# cv2.destroyAllWindows()

# ---------------------------
# In this code, x button is supported
while cv2.getWindowProperty('modified image', cv2.WND_PROP_VISIBLE) >= 1:
    k = cv2.waitKey(10)
    if k&0xff == 27:
        break
    if cv2.getWindowProperty('original image', cv2.WND_PROP_VISIBLE) < 1:
        break
cv2.destroyAllWindows()

Mouse 를 활용하여 ROI 지정하기¶

다음 예제는 cv2.selectROI함수를 사용하여, 마우스로 사각형 모양의 ROI를 지정하는 방법을 보여준다.

cv2.selectROI함수의 사용법은 다음과 같다.

ret_val = cv2.selectROI([window_name], img [, showCrossHair=True, fromCenter=False]

window_name : ROI 선택을 수행할 window이름. str
img : 보여질 이미지.
showCrossHair : ROI 중심에 십자모양 표시 여부
fromCenter : 마우스 시작지점을 영역의 중심으로 간주
ret_val = (x,y,w,h) of ROI

선택시 space or Enter

선택을 취소하고 싶을 경우 c키 누름 : ret_val는 모두 0으로

전체 예제 코드는 다음과 같다.

import cv2
import numpy as np
import os

BASE_DIR = os.path.dirname(os.path.abspath(__file__))  

# img_path = '/home/dsaint31/lecture/OpenCV_Python_Tutorial/images/lena.png'
img_path = os.path.join(BASE_DIR,"img/lena.png")

img = cv2.imread(img_path)
cv2.imshow('img', img)
x,y,w,h = cv2.selectROI('img', img, False)

print(cv2.getWindowProperty('img', cv2.WND_PROP_VISIBLE))
if w and h:
    roi = img[y:y+h, x:x+w]
    cv2.imshow('roi', roi)
    cv2.moveWindow('roi',0,0)
    cv2.imwrite('roi2.png', roi)

while cv2.getWindowProperty('img', cv2.WND_PROP_VISIBLE) >=1: # check whether x button is clicked.
    key_code = cv2.waitKey(50)&0xff #50msec
    print(key_code)
    if key_code == 27: #check whether ESC key is entered.
        break

cv2.destroyAllWindows()

다음 예제는 앞서 배운 mouse callback event를 통해 위의 예제를 구현한 것이다.
(즉, cv2.selectROI를 사용하지 않고 구현한 경우임)
비교해보면 알 수 있듯이 cv2.selectROI를 사용하는 게 훨씬 코드가 단순하고 편하다. callback을 이해하기 위해서 작성해보는 것은 좋으나, 굳이 만들어 쓰는 것보다는 기존의 구현된 것들을 잘 활용하는 습관을 들이는게 낫다.

import cv2
import numpy as np
import os

is_dragging = False

x0,y0 = -1,-1
w0,h0 = -1,-1
red = (0,0,255)

def onMouse(event, x, y, flags, param):

    global is_dragging
    global x0,y0,w0,h0

    if event == cv2.EVENT_LBUTTONDOWN:
        is_dragging = True
        x0 = x
        y0 = y
        print(x0,y0)

    elif event == cv2.EVENT_MOUSEMOVE:
        if is_dragging:
            tmp = img.copy()
            cv2.rectangle(tmp, (x0,y0), (x,y), red, 2)
            cv2.imshow('roied_img', tmp)

    elif event == cv2.EVENT_LBUTTONUP:
        if is_dragging:
            is_dragging = False
            w = x-x0
            h = y-y0

            if w>0 and h > 0:
                tmp = img.copy()

                cv2.rectangle(tmp, (x0,y0), (x,y), red, 3)
                cv2.imshow('roied_img', tmp)
                roi = img[y0:y0+h, x0:x0+w]
                #roi = img[x0:x0+w, y0:y0+h]
                cv2.imshow('roi', roi)
                cv2.moveWindow('roi',0,0)
                cv2.imwrite('./roi.png',roi)
                print('roi is cropped and saved!')
            else:
                cv2.imshow('roied_img', img)
                print('invalid roi!. select roi carefully')

    return

# img = cv2.imread('/home/dsaint31/lecture/OpenCV_Python_Tutorial/images/lena.png')
BASE_DIR = os.path.dirname(os.path.abspath(__file__))
img_path = os.path.join(BASE_DIR, "../../../images/lena.png")
img = cv2.imread(img_path)
cv2.imshow('roied_img', img)
cv2.setMouseCallback('roied_img',onMouse)

while cv2.getWindowProperty('roied_img', cv2.WND_PROP_VISIBLE) >= 1:
    key_code = cv2.waitKey(50) # millisecond
    if key_code == 27:
        # print('ESC')
        break;
    elif cv2.getWindowProperty('roied_img', cv2.WND_PROP_VISIBLE) < 1:
        # print('closed button')
        break;
    elif cv2.getWindowProperty('roi', cv2.WND_PROP_VISIBLE) < 1:
        # print('closed button')
        cv2.destroyWindow('roi')

cv2.destroyAllWindows()

Splitting and Merging Image Channels.¶

OpenCV는 NumPy의 ndarray를 사용하므로, NumPy의 dsplit 등을 통해 channel을 나눌 수 있다.

OpenCV에서도 split과 merge를 지원한다.

주의할 점은 이 NumPy의 dsplit와 OpenCV의 split 들의 결과물에서 dimension이 다르다는 점이다. 다음 예제를 통해 확인하라.

import cv2
import matplotlib.pyplot as plt
import numpy as np

# opencv
b,g,r = cv2.split(img)
img_rgb = cv2.merge((r,g,b))

plt.imshow(img_rgb)
plt.xticks([]),plt.yticks([])
plt.show()

print(b.shape)

matplotlib에서 RGB color space를 사용하므로, 위의 예제를 통해 올바른 color의 image를 확인 가능하다.

단, 마지막의 b.shape는

cv2.split을 통해 분리된 Blue channel의 image이며,
ndim=2로 shape가 (342,548)로 나온다.

이와 달리, NumPy의 dsplit을 사용한 경우,

원래의 ndim을 변경하지 않기 때문에,
(342,548,1)의 형태로 나오게 된다.

import cv2
import matplotlib.pyplot as plt
import numpy as np

# numpy
b2,g2,r2 = np.dsplit(img,3)

plt.imshow(b2)
plt.xticks([]),plt.yticks([])
plt.show()
print(b2.shape)

OpenCV에서 mono-channel image를 다룰 때는 ndim=2인 형태를 선호하므로 맨 뒤의 의미없는 1로 나오는 channel axis를 제거해야할 경우가 많다.
이 경우 많이 사용하는 방법이 NumPy에서 제공하는 squeeze함수이다.

이는 크기 1에 불과한 axis들을 제거해준다(newaxis와 반대라고 생각하면 된다.).

>>> print(b2.shape)
>>> b2 = np.squeeze(b2)
>>> print(np.array_equal(b,b2))
>>> print(b.shape)
>>> print(b2.shape)

(342, 548, 1)
True
(342, 548)
(342, 548)

참고로 cv2.split은 수행 시간이 매우 많이 요구되는 costly operation임. 가급적 NumPy의 indexing을 이용하여 BGR을 RGB로 바꾸는 방법을 사용하는 것이 좋다.

Basic Operations on Images¶

목표¶

Pixel 개개의 값에 접근하고 수정하기.¶

ndarray.item() and ndarray.itemset()¶

Note¶

Accessing Image Properties¶

Image ROI¶

Mouse 를 활용하여 ROI 지정하기¶

Splitting and Merging Image Channels.¶

References¶

`ndarray.item()` and `ndarray.itemset()`¶