
Python Tool: Quick Metadata, Stego, and Reverse GIS

A lightweight Python tool that does three things for the DFIR professional (a more detailed explanation of each function follows the script):
  • Extracts metadata from media files and prints whatever it finds
  • Checks for steganography in both images and video frames
  • Performs a reverse Google image search (GIS) on images
import exifread
from stegano import lsb
from google_images_search import GoogleImagesSearch
import cv2

# initialize the GoogleImagesSearch object with your API key and project CX
gis = GoogleImagesSearch('your_api_key', 'your_project_cx')

def get_metadata(filepath):
    with open(filepath, 'rb') as f:
        # read the metadata from the file
        tags = exifread.process_file(f)

    # print out any hidden metadata found
    for tag in tags:
        if tag.startswith("Image") or tag.startswith("EXIF"):
            print(f"{tag}: {tags[tag]}")

    # check for steganography in image files
    if filepath.lower().endswith((".jpg", ".png")):
        try:
            secret = lsb.reveal(filepath)
        except Exception:
            # depending on the stegano version, reveal() may raise rather
            # than return None when no message is found
            secret = None
        if secret:
            print(f"Steganography detected: {secret}")

        # perform reverse image search on image files
        # (search_image is kept as originally written; confirm that your
        # installed google_images_search version provides it)
        gis.search_image(filepath)
        for image in gis.results():
            print(f"Matching image found: {image.url}")

    # check for steganography in video frames
    if filepath.lower().endswith((".mp4", ".avi")):
        cap = cv2.VideoCapture(filepath)
        frame_count = 0
        while True:
            ret, frame = cap.read()
            if not ret:
                break
            frame_count += 1
            # save as PNG (lossless); JPEG recompression would alter pixel LSBs
            frame_path = f"temp_frame_{frame_count}.png"
            cv2.imwrite(frame_path, frame)
            get_metadata(frame_path)
        cap.release()

# example usage
get_metadata("example.jpg")
get_metadata("example.mp4")
Here’s a breakdown of how the script works:
  1. We begin by importing the necessary libraries:
    • exifread is used to extract metadata from media files.
    • stegano is used to detect hidden messages in image files.
    • google_images_search is used to perform reverse image searches on image files.
    • cv2 is used to extract frames from video files.
  2. Next, we create an instance of GoogleImagesSearch and pass in our API key and project cx. This will allow us to search for similar images on Google.
  3. The get_metadata() function takes a file path as input and checks if the file is an image or video file.
  4. Whatever the file type, we first use exifread to read any metadata tags stored in the file. We loop through all the tags and print out any whose names start with “Image” or “EXIF”.
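  5. Next, if the file is a .jpg or .png image, we use stegano's lsb.reveal() to check for a message hidden in the least significant bits of the pixel data. If a hidden message is found, we print it out. LSB payloads generally do not survive JPEG compression, so a hit is far more likely on a .png than on a .jpg.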
  6. We then use google_images_search to perform a reverse image search on the image file. We call gis.search_image(filepath) to search for similar images on Google, and loop through the results with gis.results(). For each result, we print out the URL of the matching image.
  7. If the file is a video file, we use cv2 to extract frames from it. We loop through all the frames and save each one as a temporary image file. We then call get_metadata() on each temporary image file, which runs the same metadata, steganography, and reverse image search steps on every frame as on a standalone image.
  8. Finally, the example usage at the bottom calls get_metadata() on a sample image and a sample video to run the entire script.
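Step 5 looks for messages hidden with least-significant-bit (LSB) encoding, which is the technique stegano's lsb module is built around. To show why such a payload is invisible to the eye yet trivially recoverable, here is a minimal sketch of the idea on a plain list of integers. The helpers embed_bits and extract_bytes are hypothetical names for illustration only; they are not stegano's API and do not reproduce its message format.

```python
def embed_bits(pixels, message):
    """Hide message bytes in the lowest bit of successive channel values."""
    # Flatten the message into bits, most significant bit first.
    bits = [(byte >> i) & 1 for byte in message for i in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("message too long for this cover")
    stego = list(pixels)
    for i, bit in enumerate(bits):
        # Clear the lowest bit, then set it to the message bit.
        stego[i] = (stego[i] & ~1) | bit
    return stego


def extract_bytes(pixels, length):
    """Rebuild `length` bytes from the lowest bit of each channel value."""
    out = bytearray()
    for start in range(0, length * 8, 8):
        byte = 0
        for value in pixels[start:start + 8]:
            byte = (byte << 1) | (value & 1)
        out.append(byte)
    return bytes(out)


# 40 made-up 8-bit channel values standing in for real pixel data.
cover = list(range(100, 140))
stego = embed_bits(cover, b"DFIR")
recovered = extract_bytes(stego, 4)
print(recovered)
```

No value changes by more than 1, which is why the altered image looks identical to the original. It is also why lossy re-encoding, which shifts pixel values by more than that, tends to wipe the payload out.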
Note that this script can violate privacy and security expectations if it is run on media you are not authorized to examine, so make sure you have proper authorization and use the script responsibly. Also, extracting frames from video files is time-consuming and resource-intensive, since every frame is decoded, written to disk as a temporary image, and analyzed, so it may not be practical for very long videos or for systems with limited hardware resources.
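One possible way to reduce that cost is to analyze only every Nth frame instead of all of them. The sketch below shows the sampling logic on a stand-in list; every_nth is a hypothetical helper, not part of the original script, and the step value is an arbitrary choice for illustration.

```python
from itertools import islice


def every_nth(frames, step):
    """Return an iterator of (index, frame) pairs for every `step`-th frame."""
    if step < 1:
        raise ValueError("step must be >= 1")
    # enumerate() numbers the frames from 0; islice() keeps indices 0, step, 2*step, ...
    return islice(enumerate(frames), 0, None, step)


# Stand-in for decoded frames; in the script these would come from cap.read().
fake_frames = [f"frame-{i}" for i in range(10)]
sampled = list(every_nth(fake_frames, 4))
print(sampled)
```

In the video loop, the same idea would mean writing and analyzing a frame only when its index is a multiple of the step. Frames read sequentially are still decoded, so the saving comes from fewer disk writes and fewer analysis calls, at the risk of missing a payload carried only by a skipped frame.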