Build a multimodal RAG application effortlessly with Gemini Pro Vision by Google

Created by Sriharsha Velicheti in Articles 8 Feb 2024

Insta-Tagline-Generator, an innovative tool designed to enhance Instagram posts with AI-powered taglines

In today's Gen Z era, characterized by the digital savviness of its populace and the ubiquity of social media platforms, tools like the Insta-Tagline-Generator are becoming increasingly indispensable. This application, which employs the advanced capabilities of Google's Gemini Pro Vision AI model, illustrates a significant leap in how technology is harnessed to elevate social media content.

By automatically generating contextually relevant and engaging taglines for Instagram posts, it not only streamlines content creation but also enriches the user's digital footprint. This blending of AI with social media strategy underscores a broader trend where technology is not just a facilitator but also an enhancer of creativity and personal brand building in the digital age, making it a quintessential tool for influencers, brands, and the average social media user alike.

To access this innovative technology, users must have their Gemini Pro API key ready, obtainable from Google's API key portal. This key is pivotal for utilizing the AI capabilities that analyze images and craft engaging taglines, blending technological innovation with creative expression for impactful social media engagement.

Environment and API key Setup:

import os

import streamlit as st
from dotenv import load_dotenv
from PIL import Image
import google.generativeai as genai

load_dotenv()  # load all the variables from .env

genai.configure(api_key=os.getenv("GOOGLE_API_KEY"))

This section imports necessary libraries and loads environment variables, particularly the GOOGLE_API_KEY. It sets up the foundation for the application by ensuring access to the Google Generative AI model via API key.
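Note that os.getenv returns None when the variable is missing, so a typo in the .env file can go unnoticed until the first request fails. A minimal sketch of an early check follows; require_api_key is a hypothetical helper introduced here for illustration and is not part of the original app or of any library.

```python
import os


def require_api_key(name="GOOGLE_API_KEY", env=None):
    """Return the API key stored under `name`, or raise a readable error.

    Hypothetical helper for illustration; `env` defaults to os.environ
    and can be replaced by a plain dict when experimenting.
    """
    env = os.environ if env is None else env
    key = (env.get(name) or "").strip()
    if not key:
        raise RuntimeError(
            f"{name} is not set; add it to your .env file before starting the app."
        )
    return key


# Demo with an explicit mapping so the example runs without a real key.
print(require_api_key(env={"GOOGLE_API_KEY": "dummy-key"}))
```

In the app, this could be used as genai.configure(api_key=require_api_key()) right after load_dotenv().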

Streamlit Interface

st.set_page_config(page_title="Tagline Generator for Instagram posts")
st.header("Tagline Generator for Instagram posts")

# Free-text prompt from the user. Note that the name `input` shadows Python's built-in.
input = st.text_input("Input Prompt: ", key="input")

# Restrict uploads to common image formats.
uploaded_file = st.file_uploader("Choose an image of your choice..", type=["jpg", "jpeg", "png"])

This code uses Streamlit to create a web interface for the app. It sets the page title, displays a header, and creates input fields for users to enter a prompt and upload an image. The file uploader accepts three image formats: jpg, jpeg, and png.

Image Processing

if uploaded_file is not None:
    # Open the upload with PIL and show it back to the user.
    image = Image.open(uploaded_file)
    st.image(image, caption="Uploaded image", use_column_width=True)

Upon uploading an image, this section uses the PIL library to open and display the image on the Streamlit interface, confirming the upload to the user.

Tagline Generation

This is where the magic happens: once the Gemini Pro Vision model has read the input image, we get our response back from its API.

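# Create the Gemini Pro Vision model object used by get_gemini_respose.
model = genai.GenerativeModel("gemini-pro-vision")

def get_gemini_respose(input, image, prompt):
    # Send the instructions, the image part, and the user's text in one request.
    response = model.generate_content([input, image[0], prompt])
    return response.text

def input_image_details(uploaded_file):
    if uploaded_file is not None:
        # Read the raw bytes and pair them with the file's MIME type.
        bytes_data = uploaded_file.getvalue()
        image_parts = [{"mime_type": uploaded_file.type, "data": bytes_data}]
        return image_parts
    else:
        raise FileNotFoundError("No file uploaded")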

This section defines the functions that process the uploaded image and communicate with the Gemini Pro Vision AI model. input_image_details prepares the image data, and get_gemini_respose sends it along with the prompts to generate a tagline.
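To see what input_image_details produces without starting Streamlit, the sketch below runs the same logic on a stand-in object. FakeUpload is a hypothetical class that only mimics the two members the function relies on (type and getvalue()); it is not Streamlit's own uploaded-file class, and the bytes are placeholders rather than a real image.

```python
from dataclasses import dataclass


@dataclass
class FakeUpload:
    """Hypothetical stand-in exposing only `.type` and `.getvalue()`."""
    type: str
    payload: bytes

    def getvalue(self) -> bytes:
        return self.payload


def input_image_details(uploaded_file):
    # Same logic as in the article: wrap the raw bytes and the MIME type
    # in a one-item list of dictionaries.
    if uploaded_file is not None:
        bytes_data = uploaded_file.getvalue()
        return [{"mime_type": uploaded_file.type, "data": bytes_data}]
    raise FileNotFoundError("No file uploaded")


parts = input_image_details(FakeUpload(type="image/png", payload=b"\x89PNG placeholder bytes"))
print(parts[0]["mime_type"], len(parts[0]["data"]))
```

The result is a list with a single dictionary, which is why the request code indexes it with image[0].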

Generating and Displaying the Tagline Using Streamlit

submit = st.button("Generate Tag line")

# Fixed instructions that are sent to the model with every request.
input_prompt = """
Generate creative and engaging taglines suitable for Instagram posts.
The taglines should be concise, catchy, and relevant to the details of the image.
Incorporate popular and contextually appropriate emojis to enhance the emotional appeal and visual interest of each tagline.
Ensure the language is positive, inclusive, and resonates with a diverse, global audience,
reflecting a range of tones from the given images.
"""

if submit:
    image_data = input_image_details(uploaded_file)
    response = get_gemini_respose(input_prompt, image_data, input)
    st.subheader("The Response is : ")
    st.write(response)

This segment adds a button to trigger tagline generation. It defines a detailed prompt for the AI and, when the user clicks "Generate Tag line", processes the image and the prompts and displays the generated tagline.
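Clicking the button before choosing a file makes input_image_details raise FileNotFoundError, and the text box may also be empty. One possible way to handle both cases is sketched below. build_request_parts is a hypothetical helper, and dropping empty user text is a design choice assumed here, not a documented requirement of the Gemini API.

```python
def build_request_parts(instructions, image_parts, user_text):
    """Assemble the list passed to the model: instructions, image, optional user text.

    Hypothetical helper; it mirrors the [input, image[0], prompt] order
    used in the article and skips the user text when it is blank.
    """
    if not image_parts:
        raise ValueError("Please upload an image before generating a tagline.")
    parts = [instructions.strip(), image_parts[0]]
    if user_text and user_text.strip():
        parts.append(user_text.strip())
    return parts


# Placeholder image part; a real one would carry the uploaded file's bytes.
image = [{"mime_type": "image/jpeg", "data": b"placeholder"}]
print(len(build_request_parts("Generate taglines.", image, "")))          # no user text
print(len(build_request_parts("Generate taglines.", image, "Be witty")))  # with user text
```

In the Streamlit code, the ValueError could be caught and shown with st.warning instead of a traceback.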

Once all the code is ready, follow the steps below to run the app smoothly:

Install Python Libraries: Execute the command pip install streamlit Pillow google-generativeai python-dotenv in your terminal. This installs Streamlit for web app creation, Pillow for image processing, Google's generative AI library for AI model access, and python-dotenv for loading the .env file. The packages langchain, PyPDF2, and chromadb are not imported by the code shown here, so install them only if you plan to extend the app.

Configure .env File: Create a .env file in the root directory of your project. Inside this file, include the line GOOGLE_API_KEY=your_api_key_here, replacing your_api_key_here with your actual API key obtained from Google. This step is crucial for authenticating your application's access to the Google AI model.

Run the Application: Launch your application by opening your terminal, navigating to the directory where your app script (your_app_script.py) is located, and running the command streamlit run your_app_script.py. This command starts the Streamlit server and opens the app in your default web browser.
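Before launching, it can help to confirm that the libraries from the first step are importable in the active environment. The following is a minimal sketch using only the standard library; missing_modules is a hypothetical helper, and the list uses the import names from the article's code, which differ from some pip package names (Pillow is imported as PIL, python-dotenv as dotenv).

```python
import importlib.util


def missing_modules(module_names):
    """Return the names from `module_names` that cannot be imported here."""
    missing = []
    for name in module_names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Raised when a parent package (for example `google`) is itself absent.
            found = False
        if not found:
            missing.append(name)
    return missing


# Import names used by the app's code.
REQUIRED = ["streamlit", "PIL", "google.generativeai", "dotenv"]
print("Missing:", missing_modules(REQUIRED) or "none")
```

If anything is reported as missing, rerun the pip install command in the same environment that you use for streamlit run.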

That's it. Once you press Enter, your app is up and running on a local web server. You can build on the same idea for other projects, such as an invoice text extractor. Try different prompts with your Instagram posts and let me know in the comments. An example prompt is "Generate a sarcastic meme for the given image".

Enjoy coding, have fun with the application, and experiment with different variations of it. If you found this insightful, consider following me.

Keep learning, thank you!