End to End Face Attendance System using TensorFlow

Mathanprasannakumar
5 min read · Jul 20, 2023


[Cover image: a black hole wallpaper from Wallhaven]

In this series so far, we have covered how to detect a face and how to verify two faces using cosine distance, with the FaceNet architecture as the backbone.

In this article, we will learn how to do real-time verification of a webcam stream against a reference image.

pip install opencv-python
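The rest of the pipeline also uses MediaPipe for face detection, TensorFlow for the FaceNet embeddings, and NumPy. Assuming you haven't installed them in the earlier parts of the series, one command covers them:

pip install mediapipe tensorflow numpy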

After installing OpenCV, we need to connect to the camera and receive the video stream.

import cv2

capture = cv2.VideoCapture(0)  # the index refers to the camera device; 0 is the default camera


while capture.isOpened():  # while the connected camera is available, execute the code below
    ret, frame = capture.read()  # read the camera stream frame by frame

    cv2.imshow("face recognition", frame)  # display the captured frame;
    # a new pop-up window is created and the frames are shown on it

    if cv2.waitKey(1) & 0xFF == ord('q'):
        # cv2.waitKey(1) waits 1 millisecond for a keyboard response;
        # if the pressed key is 'q', we break out of the loop
        break


capture.release()  # release the camera we connected to earlier
cv2.destroyAllWindows()  # close every window opened for displaying frames
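If the window stays blank, the read may be failing: check the boolean that capture.read() returns (ret above), or try another camera index such as cv2.VideoCapture(1).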

What each command does is explained in the comments of the code snippet above.

After receiving the frame, we need to preprocess it: detect the face in the image and compute the embedding, as we did in the previous articles.
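The snippets below assume that detectmodel is the MediaPipe face-detection model and that NumPy is imported as np. A minimal setup sketch (the confidence threshold here is my own choice):

import cv2
import numpy as np
import mediapipe as mp

detectmodel = mp.solutions.face_detection.FaceDetection(
    model_selection=0, min_detection_confidence=0.5)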

## before running this, please set anchor to the path of an available image
def facerecognition(frame, anchor="marnus.jpeg"):

    ## we check the similarity of the frame against the anchor
    ## the anchor is already predefined

    ## if the first argument is not a numpy array, it is a path and we need to read it
    if type(frame).__module__ != np.__name__:
        frame = cv2.imread(frame)

    ## cv2 reads images in BGR format, so we convert to RGB
    frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    anchor = cv2.imread(anchor)
    anchor = cv2.cvtColor(anchor, cv2.COLOR_BGR2RGB)

    ## detect the face in both images
    detect1 = detectmodel.process(frame)
    detect2 = detectmodel.process(anchor)

    bbox1 = detect1.detections[0].location_data.relative_bounding_box
    bbox2 = detect2.detections[0].location_data.relative_bounding_box

The above function takes the frame captured from the video stream; anchor is the image you want to verify against, so choose any image you like.

Initially, I check whether the argument passed is a NumPy array or a path to an image file. Keep in mind that OpenCV returns a NumPy array from the stream.

If frame is an image path, we read the image with OpenCV.

OpenCV works with BGR channels by default, so we need to convert to RGB because our model expects the image in RGB format.

Then I extended the code above with a function that crops the face and resizes it as we want.

## before running this, please set anchor to the path of an available image
def facerecognition(frame, anchor="marnus.jpeg"):

    ## we check the similarity of the frame against the anchor
    ## the anchor is already predefined

    ## if the first argument is not a numpy array, it is a path and we need to read it
    if type(frame).__module__ != np.__name__:
        frame = cv2.imread(frame)

    ## cv2 reads images in BGR format, so we convert to RGB
    frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    anchor = cv2.imread(anchor)
    anchor = cv2.cvtColor(anchor, cv2.COLOR_BGR2RGB)

    ## detect the face in both images
    detect1 = detectmodel.process(frame)
    detect2 = detectmodel.process(anchor)

    bbox1 = detect1.detections[0].location_data.relative_bounding_box
    bbox2 = detect2.detections[0].location_data.relative_bounding_box

    ## crop both images at their bounding boxes
    crop1 = crop(bbox1, frame)
    crop2 = crop(bbox2, anchor)

After passing the frame to the MediaPipe detection model, we get the bounding box (bbox) of the face.

We pass the bbox and the frame to the crop function.

def crop(bbox, image):
    ## the bbox coordinates are normalized, so scale them by the image width/height
    lx = int(bbox.xmin * image.shape[1])
    ly = int(bbox.ymin * image.shape[0])
    rx = int(bbox.width * image.shape[1] + lx)
    ry = int(bbox.height * image.shape[0] + ly)
    crop = image[ly:ry, lx:rx]  ## slice out the face region
    return crop

I believe the above code is self-explanatory: the bbox coordinates are normalized values, so we convert them to absolute pixel locations by multiplying by the image width and height, and then slice the image, which returns the cropped face.
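For example, on a 640×480 frame with bbox.xmin = 0.25 and bbox.width = 0.5, we get lx = int(0.25 × 640) = 160 and rx = int(0.5 × 640 + 160) = 480, and likewise for the y coordinates using the frame height.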

OK, now we have the cropped face; next we need to resize it to the input size the model requires.

def resize(img, target_size):
    ## scale so the image fits inside target_size while keeping the aspect ratio
    factor1 = target_size[0] / img.shape[0]
    factor2 = target_size[1] / img.shape[1]
    minfact = min(factor1, factor2)
    newsize = (int(img.shape[1] * minfact), int(img.shape[0] * minfact))
    res = cv2.resize(img, newsize)
    ## pad with zeros so the output is exactly target_size,
    ## since the embedding model expects a fixed input size
    pady = target_size[0] - res.shape[0]
    padx = target_size[1] - res.shape[1]
    res = np.pad(res, ((0, pady), (0, padx), (0, 0)), "constant")
    return res

The function above scales the passed image to fit the target size of (160×160×3) while preserving the aspect ratio, then pads the remainder with zeros so the output is exactly 160×160, which the embedding model expects.
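For instance, a 240×320 crop gives factor1 = 160/240 ≈ 0.667 and factor2 = 160/320 = 0.5; the smaller factor 0.5 yields a 120×160 resize, which is then padded to 160×160.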

After resizing the image, we need to add an extra (batch) dimension to it and pass it to the model to predict the embedding.

## scale pixels to [0, 1] and add a batch dimension: (160,160,3) -> (1,160,160,3)
firstimage = np.expand_dims(rescrop1 / 255, 0)
secondimage = np.expand_dims(rescrop2 / 255, 0)

## embeddings of both images; predict returns a batch, [0] takes the single vector
firstembed = model.predict(firstimage)[0]
secondembed = model.predict(secondimage)[0]
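Here model is the FaceNet network built in the earlier articles of this series. As a minimal sketch, assuming you saved that model to disk (the file name facenet.h5 is hypothetical):

from tensorflow.keras.models import load_model

model = load_model("facenet.h5")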

After getting the embeddings, pass the two of them to calculate the cosine distance, defined as 1 − (e1·e2) / (‖e1‖ ‖e2‖).

def cosinedistance(embed1, embed2):
    a = np.matmul(np.transpose(embed1), embed2)  ## dot product e1.e2
    b = np.sum(np.multiply(embed1, embed1))      ## squared norm of e1
    c = np.sum(np.multiply(embed2, embed2))      ## squared norm of e2
    cossim = a / (np.sqrt(b) * np.sqrt(c))       ## cosine similarity
    distance = 1 - cossim
    return distance

I already explained cosine distance in a previous article in this series; please check it out if the code above is unclear.
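As a quick sanity check (the toy vectors here are mine): identical embeddings should give a distance of 0, and orthogonal ones a distance of 1.

print(cosinedistance(np.array([1.0, 0.0]), np.array([1.0, 0.0])))  ## 0.0
print(cosinedistance(np.array([1.0, 0.0]), np.array([0.0, 1.0])))  ## 1.0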

Here is the final code:

## before running this, please set anchor to the path of an available image
def facerecognition(frame, anchor="marnus.jpeg"):

    ## we check the similarity of the frame against the anchor
    ## the anchor is already predefined

    ## if the first argument is not a numpy array, it is a path and we need to read it
    if type(frame).__module__ != np.__name__:
        frame = cv2.imread(frame)

    ## cv2 reads images in BGR format, so we convert to RGB
    frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    anchor = cv2.imread(anchor)
    anchor = cv2.cvtColor(anchor, cv2.COLOR_BGR2RGB)

    ## detect the face in both images
    detect1 = detectmodel.process(frame)
    detect2 = detectmodel.process(anchor)

    bbox1 = detect1.detections[0].location_data.relative_bounding_box
    bbox2 = detect2.detections[0].location_data.relative_bounding_box

    ## crop both images at their bounding boxes
    crop1 = crop(bbox1, frame)
    crop2 = crop(bbox2, anchor)

    ## resize the crops to the model input size
    rescrop1 = resize(crop1, (160, 160, 3))
    rescrop2 = resize(crop2, (160, 160, 3))

    ## scale pixels and add the batch dimension
    firstimage = np.expand_dims(rescrop1 / 255, 0)
    secondimage = np.expand_dims(rescrop2 / 255, 0)

    ## embeddings of both images
    firstembed = model.predict(firstimage)[0]
    secondembed = model.predict(secondimage)[0]

    ## cosine distance
    distance = cosinedistance(firstembed, secondembed)
    print(distance)

    if distance <= 0.4:
        print("Same person")
    else:
        print("Different person")

OK, but whoooo calls this function????

capture = cv2.VideoCapture(0)

print("Press v to verify")
print("-----------------")
print("Press q to close the cam")


while capture.isOpened():
    ret, frame = capture.read()

    cv2.imshow("face recognition", frame)

    ## read the key once per frame so a press is never swallowed
    key = cv2.waitKey(1) & 0xFF
    if key == ord('v'):
        facerecognition(frame)
    if key == ord('q'):
        break


capture.release()
cv2.destroyAllWindows()

The above loop displays the stream and, whenever you press 'v', calls facerecognition on the current frame and prints whether the two faces belong to the same person. That's it, that's it.

You can tweak the anchor image path to match your requirements.

Summary of steps:

  1. Connect to the webcam.
  2. Receive a frame from the stream.
  3. Preprocess the frame.
  4. Detect the faces.
  5. Crop and resize the faces.
  6. Calculate the embeddings.
  7. Calculate the cosine distance.
  8. Verify the frame based on the defined threshold value.

Well done! We built a simple system and got the core idea of how face verification works on a video stream.

Based on this simple, super easy idea, we are going to develop a web application that can be used as a face attendance system, and then deploy it on Amazon AWS behind an Nginx proxy.

Stay tuned for the next phase.

Ta Ta Bye Bye.
