Explain text support in File_API.ipynb notebook (#104)

* Pre-format with pyink
* Add plain text support
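
A minimal sketch of the idea behind the plain text support, not code from this commit: pick a MIME type for a file before uploading, and fall back to `text/plain` when nothing more specific is known. The helper name `pick_mime_type` is hypothetical, and the local guess uses Python's standard `mimetypes` module as an assumption; it is not the File API's own detection logic. The `mime_type` argument of `genai.upload_file` is the one used in the notebook diff below.

```python
import mimetypes


def pick_mime_type(path: str, fallback: str = "text/plain") -> str:
    """Hypothetical helper: guess a MIME type locally, else return the fallback.

    Assumption: the local guess comes from Python's stdlib `mimetypes` table,
    which may differ from what the File API detects on its side.
    """
    guessed, _encoding = mimetypes.guess_type(path)
    return guessed if guessed is not None else fallback


# Usage sketch (requires a configured API key, so it is left commented out):
# cpp_file = genai.upload_file(
#     path="gemma.cpp",
#     display_name="gemma.cpp",
#     mime_type=pick_mime_type("gemma.cpp"),
# )
```

The fallback is a parameter rather than a constant so that callers who are not sure a file is text can choose a different default.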
google-gemini · May 7, 2024 · 7416a67
1 parent 31dbaf5
Showing 1 changed file with 153 additions and 31 deletions.
diff --git a/quickstarts/File_API.ipynb b/quickstarts/File_API.ipynb
@@ -18,7 +18,7 @@
       },
       "outputs": [],
       "source": [
-        "#@title Licensed under the Apache License, Version 2.0 (the \"License\");\n",
+        "# @title Licensed under the Apache License, Version 2.0 (the \"License\");\n",
         "# you may not use this file except in compliance with the License.\n",
         "# You may obtain a copy of the License at\n",
         "#\n",
@@ -53,18 +53,17 @@
       },
       "source": [
         "The Gemini API supports prompting with text, image, and audio data, also known as *multimodal* prompting. You can include text, image,\n",
-        "and audio in your prompts. For small images, you can point the Gemini model\n",
-        "directly to a local file when providing a prompt. For larger images, videos\n",
-        "(sequences of image frames), and audio, upload the files with the [File\n",
-        "API](https://ai.google.dev/api/rest/v1beta/files) before including them in\n",
-        "prompts.\n",
+        "and audio in your prompts. Small files can be embedded directly into a prompt. For larger files, upload the files with the [File\n",
+        "API](https://ai.google.dev/api/rest/v1beta/files) before including them in prompts.\n",
         "\n",
         "The File API lets you store up to 20GB of files per project, with each file not\n",
         "exceeding 2GB in size. Files are stored for 48 hours and can be accessed with\n",
         "your API key for generation within that time period. It is available at no cost in all regions where the [Gemini API is\n",
         "available](https://ai.google.dev/available_regions).\n",
         "\n",
-        "For information on valid file formats (MIME types) and supported models, see [Supported file formats](https://ai.google.dev/tutorials/prompting_with_media#supported_file_formats).\n",
+        "For information on valid file formats (MIME types) and supported models, see the documentation on\n",
+        "[supported file formats](https://ai.google.dev/tutorials/prompting_with_media#supported_file_formats)\n",
+        "and view the text examples at the end of this guide.\n",
         "\n",
         "Note: Videos must be converted into image frames before uploading to the File\n",
         "API.\n",
@@ -79,17 +78,7 @@
       "metadata": {
         "id": "_d_yY8XWGQ12"
       },
-      "outputs": [
-        {
-          "name": "stdout",
-          "output_type": "stream",
-          "text": [
-            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m142.1/142.1 kB\u001b[0m \u001b[31m1.3 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
-            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m663.6/663.6 kB\u001b[0m \u001b[31m13.8 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
-            "\u001b[?25h"
-          ]
-        }
-      ],
+      "outputs": [],
       "source": [
         "!pip install -U -q google-generativeai"
       ]
@@ -112,7 +101,7 @@
         "id": "YdyC6Z6wqxz-"
       },
       "source": [
-        "### Authentication Overview\n",
+        "## Authentication\n",
         "\n",
         "**Important:** The File API uses API keys for authentication and access. Uploaded files are associated with the API key's cloud project. Unlike other Gemini APIs that use API keys, your API key also grants access to data you've uploaded to the File API, so take extra care in keeping your API key secure. For best practices on securing API keys, refer to Google's [documentation](https://support.google.com/googleapi/answer/6310037)."
       ]
@@ -138,7 +127,7 @@
       "source": [
         "from google.colab import userdata\n",
         "\n",
-        "GOOGLE_API_KEY=userdata.get('GOOGLE_API_KEY')\n",
+        "GOOGLE_API_KEY = userdata.get(\"GOOGLE_API_KEY\")\n",
         "genai.configure(api_key=GOOGLE_API_KEY)"
       ]
     },
@@ -148,7 +137,7 @@
         "id": "c-z4zsCUlaru"
       },
       "source": [
-        "## Upload a file to the File API\n",
+        "## Upload file\n",
         "\n",
         "The File API lets you upload a variety of multimodal MIME types, including images and audio formats. The File API handles inputs that can be used to generate content with [`model.generateContent`](https://ai.google.dev/api/rest/v1/models/generateContent) or [`model.streamGenerateContent`](https://ai.google.dev/api/rest/v1/models/streamGenerateContent).\n",
         "\n",
@@ -161,7 +150,7 @@
         "id": "2wsJ0vHNNtdJ"
       },
       "source": [
-        "First, we will prepare a sample image to upload to the API.\n",
+        "First, prepare a sample image to upload to the API.\n",
         "\n",
         "Note: You can also [upload your own files](https://github.com/google-gemini/cookbook/tree/main/examples/Upload_files.ipynb) to use."
       ]
@@ -196,7 +185,7 @@
       ],
       "source": [
         "!curl -o image.jpg \"https://storage.googleapis.com/generativeai-downloads/images/jetpack.jpg\"\n",
-        "Image(filename='image.jpg')"
+        "Image(filename=\"image.jpg\")"
       ]
     },
     {
@@ -205,7 +194,7 @@
         "id": "EEoXN0f3N2yc"
       },
       "source": [
-        "Next, we'll upload that file to the File API."
+        "Now upload that file to the File API."
       ]
     },
     {
@@ -219,13 +208,12 @@
           "name": "stdout",
           "output_type": "stream",
           "text": [
-            "Uploaded file 'Sample drawing' as: https://generativelanguage.googleapis.com/v1beta/files/p0dsmt12b68\n"
+            "Uploaded file '' as: https://generativelanguage.googleapis.com/v1beta/files/p0dsmt12b68\n"
           ]
         }
       ],
       "source": [
-        "sample_file = genai.upload_file(path=\"image.jpg\",\n",
-        "                                display_name=\"Sample drawing\")\n",
+        "sample_file = genai.upload_file(path=\"image.jpg\", display_name=\"Sample drawing\")\n",
         "\n",
         "print(f\"Uploaded file '{sample_file.display_name}' as: {sample_file.uri}\")"
       ]
@@ -282,7 +270,9 @@
       "source": [
         "## Generate content\n",
         "\n",
-        "After uploading the file, you can make `GenerateContent` requests that reference the File API URI. In this example, we create prompt that starts with a text followed by the uploaded image."
+        "After uploading the file, you can make `GenerateContent` requests that reference the file by providing the URI. In the Python SDK you can pass the returned object directly.\n",
+        "\n",
+        "Here you create a prompt that starts with text and includes the uploaded image."
       ]
     },
     {
@@ -303,7 +293,9 @@
       "source": [
         "model = genai.GenerativeModel(model_name=\"models/gemini-1.5-pro-latest\")\n",
         "\n",
-        "response = model.generate_content([\"Describe the image with a creative description.\", sample_file])\n",
+        "response = model.generate_content(\n",
+        "    [\"Describe the image with a creative description.\", sample_file]\n",
+        ")\n",
         "\n",
         "print(response.text)"
       ]
@@ -314,7 +306,7 @@
         "id": "IrPDYdQSKTg4"
       },
       "source": [
-        "## Delete Files\n",
+        "## Delete files\n",
         "\n",
         "Files are automatically deleted after 2 days or you can manually delete them using `files.delete()`."
       ]
@@ -336,7 +328,137 @@
       ],
       "source": [
         "genai.delete_file(sample_file.name)\n",
-        "print(f'Deleted {sample_file.display_name}.')"
+        "print(f\"Deleted {sample_file.display_name}.\")"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "u_aF5anOvKsO"
+      },
+      "source": [
+        "## Supported text types\n",
+        "\n",
+        "In addition to media uploads, the File API can be used to embed text files, such as Python code or Markdown files, into your prompts.\n",
+        "\n",
+        "This example shows you how to load a Markdown file into a prompt using the File API."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "id": "3Hz37jFBSr9l"
+      },
+      "outputs": [
+        {
+          "name": "stdout",
+          "output_type": "stream",
+          "text": [
+            "## Steps to Take Before Contributing to the Gemini API Cookbook:\n",
+            "\n",
+            "Here's what you should do before you begin writing:\n",
+            "\n",
+            "**1. Contributor License Agreement (CLA):**\n",
+            "\n",
+            "*  Visit https://cla.developers.google.com/ to check if you or your employer have already signed the Google CLA. If not, you'll need to sign one to allow the project to use and redistribute your contributions.\n",
+            "\n",
+            "**2. Familiarize Yourself with Style Guides:**\n",
+            "\n",
+            "*  Read the highlights of the technical writing style guide: https://developers.google.com/style/highlights \n",
+            "*  Review the style guide for the programming language you'll be using: https://google.github.io/styleguide/\n",
+            "\n",
+            "**3. Consider Using pyink (for Python notebooks):**\n",
+            "\n",
+            "*  While not mandatory, running `pyink` on your *.ipynb files can help maintain consistent style and avoid potential issues.\n",
+            "\n",
+            "**4. Propose Your Contribution:**\n",
+            "\n",
+            "*  Before writing anything, create an issue on the GitHub repository (https://github.com/google-gemini/cookbook/issues) to discuss your idea and receive guidance on structuring your content. This helps ensure your contribution aligns with the project's goals and avoids wasted effort.\n",
+            "\n",
+            "**5. Understand the Evaluation Criteria:**\n",
+            "\n",
+            "*  The project considers factors like originality, pedagogical value, and quality when accepting new guides. Aim to make your contribution as strong as possible in these areas. \n",
+            "\n"
+          ]
+        }
+      ],
+      "source": [
+        "# Download a Markdown file and ask a question about it.\n",
+        "\n",
+        "!curl -so contrib.md https://raw.githubusercontent.com/google-gemini/cookbook/main/CONTRIBUTING.md\n",
+        "\n",
+        "md_file = genai.upload_file(path=\"contrib.md\", display_name=\"Contributors guide\")\n",
+        "\n",
+        "model = genai.GenerativeModel(model_name=\"models/gemini-1.5-pro-latest\")\n",
+        "response = model.generate_content(\n",
+        "    [\n",
+        "        \"What should I do before I start writing, when following these guidelines?\",\n",
+        "        md_file,\n",
+        "    ]\n",
+        ")\n",
+        "print(response.text)"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "pmmVaBz4Ss3W"
+      },
+      "source": [
+        "Some common text formats, such as `text/x-python`, `text/html`, and `text/markdown`, are detected automatically. If you know a file is text but the API does not detect it as such, you can explicitly set the MIME type to `text/plain`."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "id": "8m4qpfTqzE9o"
+      },
+      "outputs": [
+        {
+          "name": "stdout",
+          "output_type": "stream",
+          "text": [
+            "## Program Breakdown: Gemma Language Model Example\n",
+            "\n",
+            "This C++ program demonstrates how to use the Gemma language model for text generation. Let's break down what each part does:\n",
+            "\n",
+            "**1. Headers and Setup:**\n",
+            "\n",
+            "*   Includes necessary libraries like `iostream` for input/output, `gemma.h` for the Gemma model, and others for thread management and argument parsing.\n",
+            "*   Defines a `tokenize` function that prepares the input prompt string by adding specific start/end tokens and converting it into a sequence of integer tokens using the provided tokenizer.\n",
+            "\n",
+            "**2. Main Function:**\n",
+            "\n",
+            "*   **Argument Parsing:** Uses `LoaderArgs` to parse command-line arguments related to the model, tokenizer, weights, and other settings.\n",
+            "*   **Thread Pool Creation:** Creates a thread pool based on the available hardware concurrency for efficient parallel processing.\n",
+            "*   **Model and Cache Initialization:**\n",
+            "    *   Loads the Gemma model using the specified tokenizer and weights.\n",
+            "    *   Creates a Key-Value (KV) cache, which is used for caching intermediate results during generation. \n",
+            "*   **Random Number Generator:** Sets up a random number generator using `std::mt19937` for stochastic aspects of the model.\n",
+            "*   **Prompt Tokenization:** Tokenizes the example instruction \"Write a greeting to the world.\" using the `tokenize` function. \n",
+            "*   **Stream Token Callback:** Defines a callback function `stream_token` that is called for each generated token. It keeps track of the generation progress and prints the generated text.\n",
+            "*   **Text Generation:** Calls the `GenerateGemma` function to generate text based on the provided prompt, model, KV cache, and various parameters like maximum token limits and temperature. The `stream_token` callback is used to process each generated token.\n",
+            "*   **Output:** Prints the final generated text to the console. \n",
+            "\n",
+            "**In essence, this program takes an instruction as input, uses the Gemma language model to generate text based on that instruction, and then outputs the generated text to the user.** \n",
+            "\n"
+          ]
+        }
+      ],
+      "source": [
+        "# Download some C++ code and force the MIME type to text/plain when uploading.\n",
+        "\n",
+        "!curl -so gemma.cpp https://raw.githubusercontent.com/google/gemma.cpp/main/examples/hello_world/run.cc\n",
+        "\n",
+        "cpp_file = genai.upload_file(\n",
+        "    path=\"gemma.cpp\", display_name=\"gemma.cpp\", mime_type=\"text/plain\"\n",
+        ")\n",
+        "\n",
+        "model = genai.GenerativeModel(model_name=\"models/gemini-1.5-pro-latest\")\n",
+        "response = model.generate_content([\"What does this program do?\", cpp_file])\n",
+        "print(response.text)"
       ]
     }
   ],