Skip to content

gcp_image_loader

Documentation for GCPImageLoader

Functionality

This class is responsible for retrieving images from Google Cloud Platform's Cloud Storage. It extends the functionality of GCPDataLoader to load and convert image data into PIL Image objects.

Main Purpose

Provide an easy-to-use interface to load image data stored in Google Cloud Storage with an optional retry strategy and schema features.

Motivation

GCPImageLoader abstracts the complexities of fetching image data from remote cloud storage while ensuring robust image processing. It allows integration with image pipelines seamlessly.

Inheritance

GCPImageLoader inherits from GCPDataLoader, utilizing its cloud storage operations, and customizes the image extraction via the _get_item method.

Method: _get_item

Functionality

Resets the pointer of a BytesIO file and extracts an image. It converts raw image bytes into a PIL Image object.

Parameters

  • file: io.BytesIO containing the image data.

Return Value

  • Returns a PIL Image object based on the file data.

Usage

  • Purpose: To safely transform image bytes into a usable PIL Image instance.

Example

from io import BytesIO
from PIL import Image
from embedding_studio.data_storage.loaders.cloud_storage.gcp.gcp_image_loader import GCPImageLoader

# Load image bytes into a BytesIO object
file = BytesIO(image_bytes)

loader = GCPImageLoader()
image = loader._get_item(file)