01.AI API Platform

Get Started

Introduction

The 01.AI API platform enables developers to integrate advanced natural language processing capabilities into their own applications. Developers can utilize the AI capabilities of the Yi series LLMs to perform a variety of tasks, such as text generation, language translation, content summarization, logical reasoning, mathematical calculation, and code generation.

The 01.AI API platform provides flexible calling methods, supports various programming languages, and can be customized with features to meet the needs of different scenarios. Both individual and enterprise developers can unlock new approaches to innovate, improve user experience and drive business growth.

In addition, the 01.AI API platform also provides detailed documentation and sample code to help developers quickly get started and utilize the capabilities of the Yi series LLMs effectively.

Advantages

  • OpenAI compatible: The 01.AI API is highly compatible with the OpenAI API, so developers need only minimal code changes for a smooth migration.
  • Fast inference speed: We continuously optimize our inference stack to improve API performance. Our APIs offer faster inference, which reduces processing time while preserving response quality, and the optimized interface significantly lowers response latency for a smoother user experience.
  • A diverse model matrix: For different business scenarios, the 01.AI API platform provides a range of LLMs with varying capabilities, parameter sizes, prices, and context window sizes. Whatever the application, the 01.AI API platform has a model to match.

Quickstart

Create an API key

  • You can click API Keys to manage your API keys.
  • Please do not share your API key with others, or expose it in the browser or other client-side code. In order to protect the security of your account, 01.AI may also automatically disable any API key that has leaked publicly.
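To keep keys out of source code, one common pattern is to read them from an environment variable. A minimal sketch (the variable name YI_API_KEY is just an example, not a platform requirement):

```python
import os

# Read the key from the environment so it never lands in source control
# or client-side code. "YI_API_KEY" is an example variable name.
api_key = os.environ.get("YI_API_KEY", "")
if not api_key:
    print("Set YI_API_KEY before making API requests.")
```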

Making an API request

HTTP request
```shell
curl https://api.01.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $API_KEY" \
  -d '{
    "model": "yi-large",
    "messages": [{"role": "user", "content": "Hi, who are you?"}],
    "temperature": 0.3
  }'
```
Using SDK

The API interface is compatible with OpenAI's Python SDK and can be used with a simple configuration.

To use the OpenAI SDK, make sure you are running Python 3.7.1 or later and OpenAI SDK version 1.0.0 or later.

```shell
pip install openai
```
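As a sanity check, the interpreter and SDK versions can be verified before running any samples. A standard-library-only sketch:

```python
import sys
from importlib.metadata import PackageNotFoundError, version

# The platform samples assume Python 3.7.1+ and the v1 OpenAI SDK.
assert sys.version_info >= (3, 7), "Python 3.7.1 or later is required"

try:
    sdk_version = version("openai")
except PackageNotFoundError:
    sdk_version = None  # not installed yet; `pip install openai` fetches a v1 release

if sdk_version is not None:
    assert int(sdk_version.split(".")[0]) >= 1, "openai>=1.0.0 is required"
```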
Sample Request
```python
from openai import OpenAI

API_BASE = "https://api.01.ai/v1"
API_KEY = "your key"

client = OpenAI(
    api_key=API_KEY,
    base_url=API_BASE
)
completion = client.chat.completions.create(
    model="yi-large",
    messages=[{"role": "user", "content": "Hi, who are you?"}]
)
print(completion)
```

Models and Pricing

The 01.AI API Platform offers a range of Yi series LLMs with different features and pricing. The Yi series is a family of large language models developed by 01.AI. They perform well across a wide range of tasks, such as chat completion, text generation, logical reasoning, mathematical computation, and code writing. The Yi series LLMs are powerful and easy to use; choose the model that best fits your actual needs.

| Model | Context Window Size | Features | Applications | Price per 1M input tokens | Price per 1M output tokens |
|---|---|---|---|---|---|
| yi-large | 32K | The brand-new state-of-the-art model, offering outstanding chat and text generation. | The expert in complex inference, prediction, and natural language understanding scenarios. | $3 | $3 |
| yi-large-turbo | 4K | High-performance, cost-effective model offering powerful capabilities at a competitive price. | Ideal for a wide range of scenarios, including complex inference and high-quality text generation. | $0.19 | $0.19 |
| yi-large-fc | 32K | Specialized model with tool-use capability. The model decides whether to call a tool based on the tool definitions passed in by the user, and the call is generated in the specified format. | Applicable to production scenarios that require building agents or workflows. | $3 | $3 |
| yi-vision | 16K | Complex visual task model providing high-performance understanding and analysis over multiple images. | Ideal for scenarios requiring analysis and interpretation of images and charts, such as image question answering, chart understanding, OCR, visual reasoning, education, research report understanding, or multilingual document reading. | $0.19 | $0.19 |

Endpoints

Create chat completion

Overview

Given a list of messages comprising a conversation, the model will return a response.

POST https://api.01.ai/v1/chat/completions

Request body

Table 1: General Fields

| Frame | Field | Type | Required | Description | Default Value | Example Value |
|---|---|---|---|---|---|---|
| Header | Content-Type | string | Yes | Type of content. | N/A | application/json |
| Header | Authorization | string | Yes | API key. | N/A | your-api-key |
| Body | model | string | Yes | ID of the model to use. | N/A | yi-large |
| Body | messages | array | Yes | A list of messages comprising the conversation so far. | N/A | See Table 2: Messages Properties |
| Body | tools | array | No | A list of tools the model may call. Currently only functions are supported as tools. | N/A | See Tool Use |
| Body | tool_choice | string or object | No | Controls whether the model calls a tool, and which one. | N/A | auto |
| Body | max_tokens | int or null | No | The maximum number of tokens that can be generated in the chat completion. The total length of input and generated tokens is limited by the model's context length. | N/A | 1000 |
| Body | top_p | float | No | Use nucleus sampling: the cumulative probability over candidate tokens is computed in decreasing probability order and cut off once it reaches top_p. Alter either temperature or top_p, but not both. | 0.9 | Ranges between 0 and 1. |
| Body | temperature | float | No | Amount of randomness injected into the response. Use values closer to 0.0 for analytical or multiple-choice tasks, and larger values for creative and generative tasks. Even at 0.0, results are not fully deterministic. | 0.3 | Ranges between 0 and 2. |
| Body | stream | boolean | No | If set, partial message deltas are sent. | false | false |
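Putting the fields above together, a request body exercising the optional parameters might look like the sketch below (values are illustrative; set either temperature or top_p, not both):

```python
import json

# Illustrative request body using the fields from Table 1.
payload = {
    "model": "yi-large",
    "messages": [{"role": "user", "content": "Hi, who are you?"}],
    "max_tokens": 1000,
    "temperature": 0.3,  # 0..2; lower values suit analytical tasks
    "stream": False,
}
body = json.dumps(payload)  # send as the POST body with Content-Type: application/json
```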
Table 2: Messages Properties

The messages field is a List<message>, comprising the conversation so far. Each message is an object with a role and content; the supported roles are system, user, and assistant.

System message

  • content (string, required): The contents of the system message.
  • role (string, required): The role of the message author, in this case system.

```json
{
  "content": "you are a robot",
  "role": "system"
}
```

User message

  • content (string, required): The contents of the user message.
  • role (string, required): The role of the message author, in this case user.

```json
{
  "content": "hello",
  "role": "user"
}
```

Assistant message

  • content (string, required): The contents of the assistant message.
  • role (string, required): The role of the message author, in this case assistant.

```json
{
  "content": "hello, i'm a robot",
  "role": "assistant"
}
```

The chat completion object

| Property | Type | Sub-properties | Description | Example Value |
|---|---|---|---|---|
| id | string | N/A | A unique identifier for the chat completion. | cmpl-1cfbad15 |
| object | string | N/A | The object type, which is always chat.completion. | chat.completion |
| created | long | N/A | The Unix timestamp (in seconds) of when the chat completion was created. | 1178759 |
| model | string | N/A | The model used for the chat completion. | yi-large |
| choices | array<choice> | index | The index of the choice in the list of choices, starting from 0. | 0 |
| | | message | The assistant message; see Table 2: Messages Properties. | See Table 2: Messages Properties |
| | | finish_reason | The reason the model stopped generating tokens: stop if the model hit a natural stop point or a provided stop sequence; length if the maximum number of tokens specified in the request was reached; content_filter if content was omitted due to a flag from our content filters. | stop |
| usage | array<usage> | completion_tokens | Number of tokens in the generated completion. | 48 |
| | | prompt_tokens | Number of tokens in the prompt. | 18 |
| | | total_tokens | Total number of tokens used in the request (prompt + completion). | 66 |
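To illustrate the shape above, here is a hand-written stand-in for a chat.completion payload and how its fields are typically read (the values mirror the example column, not a live response):

```python
# Hand-written stand-in for a chat.completion response (not live data).
response = {
    "id": "cmpl-1cfbad15",
    "object": "chat.completion",
    "created": 1178759,
    "model": "yi-large",
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Hello!"},
            "finish_reason": "stop",
        }
    ],
    "usage": {"completion_tokens": 48, "prompt_tokens": 18, "total_tokens": 66},
}

reply = response["choices"][0]["message"]["content"]  # the assistant's text
total = response["usage"]["total_tokens"]             # prompt + completion
```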

List models

Overview

List models available in the API.

GET https://api.01.ai/v1/models

Response body

| Property | Type | Description | Example Value |
|---|---|---|---|
| id | String | The ID of an available model. | yi-large |
| object | String | Type of the object; in this case always model. | model |
| created | Long | The Unix timestamp (in seconds) of when the model was created. | 1178759 |
| ownedBy | String | Owner of the model; in this case always 01.ai. | 01.ai |

Example Request and Response

HTTP Example
  • Request
```shell
curl --location 'https://api.01.ai/v1/models' \
  --header "Authorization: Bearer $API_KEY"
```
  • Response
```json
{
  "data": [
    {
      "id": "yi-large",
      "object": "model",
      "created": 1708258504,
      "ownedBy": "01.ai",
      "root": "",
      "parent": ""
    }
  ],
  "object": "list"
}
```
SDK Example
  • Request
```python
from openai import OpenAI

API_BASE = "https://api.01.ai/v1"
API_KEY = "your key"

client = OpenAI(
  api_key=API_KEY,
  base_url=API_BASE,
  timeout=300
)
models = client.models.list()
print(models)
```
  • Response
```python
SyncPage[Model](
  data=[
    Model(
      id='yi-large',
      created=1708671653,
      object='model',
      owned_by=None,
      ownedBy='01.ai',
      root='',
      parent=''
    )
  ],
  object='list'
)
```

Features

Chat Completions

Example Request and Response

Non-streaming Mode
HTTP Request
  • Request
```shell
curl https://api.01.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $API_KEY" \
  -d '{
    "model": "yi-large",
    "messages": [{"role": "user", "content": "Hi, who are you?"}],
    "temperature": 0.3
  }'
```
  • Response
```json
{
  "id": "cmpl-c730301f",
  "object": "chat.completion",
  "created": 7825887,
  "model": "yi-large",
  "usage": {
    "completion_tokens": 65,
    "prompt_tokens": 15,
    "total_tokens": 80
  },
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! My name is Yi, and I am a language model based on the transformers architecture developed by 01.AI. My purpose is to be a helpful resource for you, capable of answering questions and offering insightful information across a wide range of topics. How may I be of service to you today?"
      },
      "finish_reason": "stop"
    }
  ]
}
```
SDK Example
  • Request
```python
from openai import OpenAI

API_BASE = "https://api.01.ai/v1"
API_KEY = "your key"

client = OpenAI(
  api_key=API_KEY,
  base_url=API_BASE
)
completion = client.chat.completions.create(
  model="yi-large",
  messages=[{"role": "user", "content": "Hi, who are you?"}]
)
print(completion)
```
  • Response
```python
ChatCompletion(
  id='cmpl-8062fda5',
  choices=[
    Choice(
      finish_reason='stop',
      index=0,
      logprobs=None,
      message=ChatCompletionMessage(
        content="Hello! My name is Yi, and I am a language model based on the transformers architecture developed by 01.AI. My purpose is to be a helpful resource for you, capable of answering questions and offering insightful information across a wide range of topics. How may I be of service to you today?",
        role='assistant',
        function_call=None,
        tool_calls=None
      )
    )
  ],
  created=7826404,
  model='yi-large',
  object='chat.completion',
  system_fingerprint=None,
  usage=CompletionUsage(
    completion_tokens=65,
    prompt_tokens=15,
    total_tokens=80
  )
)
```
Streaming Mode
HTTP Request
  • Request
```shell
curl https://api.01.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $API_KEY" \
  -d '{
    "model": "yi-large",
    "messages": [{"role": "user", "content": "Hi, who are you?"}],
    "temperature": 0.3,
    "stream": true
  }'
```
  • Response
```text
data: {"id":"cmpl78796a05","object":"chat.completion.chunk","created":7828777,"model":"yi-large","choices":[{"delta":{"role":"assistant"},"index":0}],"content":"","lastOne":false}
data: {"id":"cmpl78796a05","object":"chat.completion.chunk","created":7828777,"model":"yi-large","choices":[{"delta":{"content":"Hello"},"index":0}],"content":"Hello","lastOne":false}
...
data: {"id":"cmpl78796a05","object":"chat.completion.chunk","created":7828777,"model":"yi-large","choices":[{"delta":{},"index":0,"finish_reason":"stop"}],"content":"Hello! My name is Yi, and I am a language model based on the transformers architecture developed by 01.AI. My purpose is to be a helpful resource for you, capable of answering questions and offering insightful information across a wide range of topics. How may I be of service to you today?","usage":{"completion_tokens":64,"prompt_tokens":17,"total_tokens":81},"lastOne":true}
data: [DONE]
```
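Outside the SDK, each streamed line can be handled by stripping the data: prefix and decoding the JSON chunk. A minimal parsing sketch, using hard-coded sample lines in place of a live HTTP stream:

```python
import json

# Sample server-sent-event lines standing in for a live stream.
sample_lines = [
    'data: {"object":"chat.completion.chunk","choices":[{"delta":{"role":"assistant"},"index":0}]}',
    'data: {"object":"chat.completion.chunk","choices":[{"delta":{"content":"Hello"},"index":0}]}',
    "data: [DONE]",
]

text = ""
for line in sample_lines:
    payload = line[len("data: "):]
    if payload == "[DONE]":  # sentinel marking the end of the stream
        break
    chunk = json.loads(payload)
    # Deltas may carry a role, content, or nothing at all.
    text += chunk["choices"][0]["delta"].get("content", "")
```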
SDK Example
  • Request
```python
from openai import OpenAI

API_BASE = "https://api.01.ai/v1"
API_KEY = "your key"

client = OpenAI(
  api_key=API_KEY,
  base_url=API_BASE
)
completion = client.chat.completions.create(
  model="yi-large",
  messages=[{"role": "user", "content": "Hi, who are you?"}],
  stream=True
)
for chunk in completion:
  print(chunk.choices[0].delta.content or "", end="", flush=True)
```
  • Response
Hello! My name is Yi, and I am a language model based on the transformers architecture developed by 01.AI. My purpose is to be a helpful resource for you, capable of answering questions and offering insightful information across a wide range of topics. How may I be of service to you today?

Tool Use

01.AI API endpoints support tool use, letting the model invoke explicitly defined operations through structured requests. With tool use, the yi-large-fc model delivers structured JSON output that can be used to directly invoke functions in your codebase.

Example Request and Response

According to the table below, calculate the MoM and YoY data of April 2024:

| Month | Revenue |
|---|---|
| 2023-01 | 25.78 |
| 2023-02 | 20.23 |
| 2023-03 | 21.27 |
| 2023-04 | 20.96 |
| 2023-05 | 24.33 |
| 2023-06 | 22.51 |
| 2023-07 | 23.97 |
| 2023-08 | 25.86 |
| 2023-09 | 27.05 |
| 2023-10 | 28.23 |
| 2023-11 | 27.17 |
| 2023-12 | 28.88 |
| 2024-01 | 26.33 |
| 2024-02 | 25.25 |
| 2024-03 | 28.89 |
| 2024-04 | 29.12 |
| 2024-05 | 30.08 |
| 2024-06 | 29.75 |
| 2024-07 | 28.83 |
```python
import json
from openai import OpenAI

API_BASE = "https://api.01.ai/v1"
API_KEY = "YOUR_API_KEY"
MODEL = "yi-large-fc"

client = OpenAI(
    api_key=API_KEY,
    base_url=API_BASE
)

# Example dummy function that computes the change locally.
# In production, this could be your backend API or an external API.
def calculator(time, type, pre_value, current_value):
    """Calculate YoY and MoM changes"""
    a = float(pre_value)
    b = float(current_value)
    c = (b - a) / a
    return json.dumps({"time": time, "type": type, "change": "%.2f%%" % (c * 100)})

def run_conversation():
    # Step 1: send the conversation and available functions to the model
    messages = [{"role": "user", "content": "According to the table below, calculate the MoM and YoY data of April 2024: \n| Month     | Revenue  | \n|---------|---------|\n| 2023-01 |   25.78   |\n| 2023-02 |   20.23   |\n| 2023-03 |   21.27   |\n| 2023-04 |   20.96   |\n| 2023-05 |   24.33   |\n| 2023-06 |   22.51   |\n| 2023-07 |   23.97   |\n| 2023-08 |   25.86   |\n| 2023-09 |   27.05   |\n| 2023-10 |   28.23   |\n| 2023-11 |   27.17   |\n| 2023-12 |   28.88   |\n| 2024-01 |   26.33   |\n| 2024-02 |   25.25   |\n| 2024-03 |   28.89   |\n| 2024-04 |   29.12   |\n| 2024-05 |   30.08   |\n| 2024-06 |   29.75   |\n| 2024-07 |   28.83   |"}]
    tools = [
      {
        "type": "function",
        "function": {
            "name": "calculator",
            "description": "YoY and MoM data calculator. YoY means year over year changes and MoM means month over month changes.",
            "parameters": {
                "type": "object",
                "properties": {
                    "time": {
                        "type": "string",
                        "description": "date, month, or year information"
                    },
                    "type": {
                        "type": "string",
                        "enum": ["YoY", "MoM"]
                    },
                    "pre_value": {
                        "type": "string",
                        "description": "previous value being compared"
                    },
                    "current_value": {
                        "type": "string",
                        "description": "current data for comparison"
                    }
                },
                "required": ["time", "type", "pre_value", "current_value"]
            }
        }
      }
    ]
    response = client.chat.completions.create(
        model=MODEL,
        messages=messages,
        tools=tools,
        tool_choice="auto",
    )
    response_message = response.choices[0].message
    tool_calls = response_message.tool_calls
    print(response_message)
    # Step 2: check if the model wanted to call a function
    if tool_calls:
        # Step 3: call the function
        print("...")
        # Note: the JSON response may not always be valid; be sure to handle errors
        available_functions = {
            "calculator": calculator,
        }  # only one function in this example, but you can have multiple
        messages.append(response_message)  # extend conversation with assistant's reply

        # Step 4: send the info for each function call and function response to the model
        for tool_call in tool_calls:
            function_name = tool_call.function.name
            function_to_call = available_functions[function_name]
            function_args = json.loads(tool_call.function.arguments)
            function_response = function_to_call(
                type=function_args.get("type"),
                pre_value=function_args.get("pre_value"),
                current_value=function_args.get("current_value"),
                time=function_args.get("time")
            )
            messages.append(
                {
                    "tool_call_id": tool_call.id,
                    "role": "tool",
                    "name": function_name,
                    "content": function_response,
                }
            )  # extend conversation with function response

        second_response = client.chat.completions.create(
            model=MODEL,
            messages=messages,
        )  # get a new response from the model where it can see the function response
        return second_response

print(run_conversation())
```

How Tool Use Works

  1. Provide tools and a user prompt in your API request.
  2. The model decides whether to use a tool and constructs a properly formatted tool use request.
  3. You extract tool input, execute the tool code, and return results.
  4. The model uses the tool result to formulate a response to the original prompt.

Tools Specifications

  • tools: an array with each element representing a tool. Currently only supports function.
    • type: a string indicating the category of the tool
    • function: an object that includes:
      • description | optional: a string that describes the function's purpose; use it to guide the model on when and how the function should be used
      • name | required: a string serving as the function's identifier
      • parameters | optional: a JSON Schema object that defines the parameters the function accepts
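Following the specification above, a minimal tools array for a single function might look like this sketch (the function name get_financial_data and its parameters are illustrative, not part of the platform API):

```python
# Illustrative tools array; get_financial_data is a hypothetical function.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_financial_data",
            "description": "Fetch revenue for a given month.",
            "parameters": {
                "type": "object",
                "properties": {
                    "month": {
                        "type": "string",
                        "description": "month in YYYY-MM format, e.g. 2024-04",
                    }
                },
                "required": ["month"],
            },
        },
    }
]
```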

Tool Choice

The tool_choice parameter controls whether the model can use tools and which tools it may call. This parameter enhances the flexibility of model interaction based on specific use-case requirements.

tool_choice must be either a string or an object.

Valid string values:

  • none: The model does not invoke any functions and will only generate text responses. This is the default setting when no tools are provided.
  • auto: Allows the model to choose between generating a text response or calling a function. This is the default setting when functions are available.
  • required: Forces the model to call a function.

Valid object values:

  • To explicitly require the model to call a particular function, pass an object specifying the function name. For example, to force a call to a function named get_financial_data, set tool_choice to {"type": "function", "function": {"name": "get_financial_data"}}.
  • Only the value function is supported for the type property.
  • The function property should contain an object with only the name property, set to the name of the function you'd like the model to use.
  • This configuration constrains the model to use only the specified function during the completion request, though how the model uses the tool remains at its own discretion.
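The accepted forms of tool_choice are plain values; this sketch simply enumerates them (get_financial_data is the hypothetical function named in the example above):

```python
# The three string forms.
tool_choice_none = "none"          # never call a tool; text-only response
tool_choice_auto = "auto"          # model decides whether to call a tool
tool_choice_required = "required"  # model must call some function

# The object form, forcing one specific function by name.
tool_choice_forced = {
    "type": "function",
    "function": {"name": "get_financial_data"},
}
```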

Best Practices

  • Provide detailed tool descriptions for optimal performance.
  • Note that tool calling also brings certain potential risks. We strongly recommend using it only in scenarios or workflows that clearly require tool calling.

Visual Understanding Task

Non-streaming Mode

The following figure is used for the code example. The company name and data in the figure are fictitious.

HTTP Request

```shell
curl --location 'https://api.01.ai/v1/chat/completions' \
  --header 'Authorization: Bearer $API_KEY' \
  --header 'Content-Type: application/json' \
  --data '{
    "model": "yi-vision",
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "image_url",
            "image_url": {
              "url": "https://platform.01.ai/assets/sample-table.jpg"
            }
          },
          {
            "type": "text",
            "text": "Please describe this picture in detail."
          }
        ]
      }
    ],
    "stream": false,
    "max_tokens": 1024
  }'
```

Response

```json
{
  "created": 0,
  "model": "yi-vision",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The image is a table with a green and white color scheme, listing the top 10 companies in a particular ranking. The table has six columns titled \"Rank,\" \"Rank Change,\" \"Company Name,\" \"Value (Billion Dollar),\" \"Value Change,\" \"City,\" and \"Industry.\" Each row corresponds to a different company, with the rank numbering from 1 to 10. The first column shows the rank of the company, with the first company being ranked number 1. The second column shows the rank change, with phrases like \"Unchanged,\" \"Up 1,\" \"Down 1,\" \"Up 15,\" etc. The third column lists the company names, such as \"Infinity Innovations Inc.,\" \"Space Dynamics Corp.,\" and \"Quantum Solutions Ltd.\" The fourth column lists the company values in billions of dollars, with the highest value being 138 billion. The fifth column shows the value change, with both positive and negative numbers, and one entry marked with a slash, indicating no change. The sixth column lists the cities where the companies are located, including \"Los Angeles,\" \"New York,\" and \"London.\" The last column lists the industries the companies belong to, such as \"FinTech,\" \"Social Media,\" \"Big Data,\" \"Aerospace,\" and \"Logistics.\" The table has a light background, and the columns are separated by thicker lines, while the rows are separated by thinner lines."
      },
      "finish_reason": "stop"
    }
  ]
}
```

SDK Example

```python
from openai import OpenAI
import base64
import httpx

API_BASE = "https://api.01.ai/v1"
API_KEY = "your-key-here"
client = OpenAI(
  api_key=API_KEY,
  base_url=API_BASE
)
# Approach 1: use an image URL
image = "https://platform.01.ai/assets/sample-table.jpg"
# Approach 2: fetch the image from a URL and encode it to base64
# image_url = "https://platform.01.ai/assets/sample-table.jpg"
# image = "data:image/jpeg;base64," + base64.b64encode(httpx.get(image_url).content).decode("utf-8")
# Approach 3: read a local image and encode it to base64
# image_path = "./yidemo.jpeg"
# with open(image_path, "rb") as image_file:
#     image = "data:image/jpeg;base64," + base64.b64encode(image_file.read()).decode("utf-8")
# Make a request (can be multi-round)
completion = client.chat.completions.create(
  model="yi-vision",
  messages=[
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "Please describe this picture in detail."
        },
        {
          "type": "image_url",
          "image_url": {
            "url": image
          }
        }
      ]
    },
    # multi-round (optional):
    # {
    #     "role": "assistant",
    #     "content": "<the previous assistant description of the image>"
    # },
    # {
    #     "role": "user",
    #     "content": [
    #         {
    #             "type": "text",
    #             "text": "Which company is ranked second?"
    #         }
    #     ]
    # }
  ]
)
print(completion)
```

Response

```python
ChatCompletion(id=None, choices=[Choice(finish_reason='stop', index=0,
  logprobs=None, message=ChatCompletionMessage(content='The image is a table with a green and white color scheme, listing the top 10 companies in a particular ranking. The table has six columns titled "Rank," "Rank Change," "Company Name," "Value (Billion Dollar)," "Value Change," "City," and "Industry." Each row corresponds to a different company, with the rank numbering from 1 to 10. The first column shows the rank of the company, with the first company being ranked number 1. The second column shows the rank change, with phrases like "Unchanged," "Up 1," "Down 1," "Up 15," etc. The third column lists the company names, such as "Infinity Innovations Inc.," "Space Dynamics Corp.," and "Quantum Solutions Ltd." The fourth column lists the company values in billions of dollars, with the highest value being 138 billion. The fifth column shows the value change, with both positive and negative numbers, and one entry marked with a slash, indicating no change. The sixth column lists the cities where the companies are located, including "Los Angeles," "New York," and "London." The last column lists the industries the companies belong to, such as "FinTech," "Social Media," "Big Data," "Aerospace," and "Logistics." The table has a light background, and the columns are separated by thicker lines, while the rows are separated by thinner lines.', role='assistant', function_call=None, tool_calls=None))], created=0,
  model='yi-vision', object=None, system_fingerprint=None,
  usage=CompletionUsage(completion_tokens=104, prompt_tokens=1015,
  total_tokens=1119))
```

Streaming Mode

The following figure is used for the code example. The company name and data in the figure are fictitious.

HTTP Request

```shell
curl --location 'https://api.01.ai/v1/chat/completions' \
  --header 'Authorization: Bearer $API_KEY' \
  --header 'Content-Type: application/json' \
  --data '{
    "model": "yi-vision",
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "image_url",
            "image_url": {
              "url": "https://platform.01.ai/assets/sample-table.jpg"
            }
          },
          {
            "type": "text",
            "text": "Please describe this picture in detail."
          }
        ]
      }
    ],
    "stream": true,
    "max_tokens": 1024
  }'
```

Response

```text
data: {"id":"cmpl-39d99b40-b958-4a96-b3ea839bb1445c8e","object":"chat.completion.chunk","created":1710234117,"model":"yi-vision","choices":[{"delta":{"content":""},"index":0}],"content":"","lastOne":false}
data: {"id":"cmpl-39d99b40-b958-4a96-b3ea839bb1445c8e","object":"chat.completion.chunk","created":1710234117,"model":"yi-vision","choices":[{"delta":{"content":"The"},"index":0}],"content":"The","lastOne":false}
data: {"id":"cmpl-39d99b40-b958-4a96-b3ea839bb1445c8e","object":"chat.completion.chunk","created":1710234117,"model":"yi-vision","choices":[{"delta":{"content":" image"},"index":0}],"content":"The image","lastOne":false}
...
data: {"id":"cmpl-39d99b40-b958-4a96-b3ea839bb1445c8e","object":"chat.completion.chunk","created":1710234125,"model":"yi-vision","choices":[{"delta":{"content":""},"index":0,"finish_reason":"stop"}],"content":"The image is a table with a green and white color scheme, listing the top 10 companies in a particular ranking. The table has six columns titled \"Rank,\" \"Rank Change,\" \"Company Name,\" \"Value (Billion Dollar),\" \"Value Change,\" \"City,\" and \"Industry.\" Each row corresponds to a different company, with the rank numbering from 1 to 10. The first column shows the rank of the company, with the first company being ranked number 1. The second column shows the rank change, with phrases like \"Unchanged,\" \"Up 1,\" \"Down 1,\" \"Up 15,\" etc. The third column lists the company names, such as \"Infinity Innovations Inc.,\" \"Space Dynamics Corp.,\" and \"Quantum Solutions Ltd.\" The fourth column lists the company values in billions of dollars, with the highest value being 138 billion. The fifth column shows the value change, with both positive and negative numbers, and one entry marked with a slash, indicating no change. The sixth column lists the cities where the companies are located, including \"Los Angeles,\" \"New York,\" and \"London.\" The last column lists the industries the companies belong to, such as \"FinTech,\" \"Social Media,\" \"Big Data,\" \"Aerospace,\" and \"Logistics.\" The table has a light background, and the columns are separated by thicker lines, while the rows are separated by thinner lines.","usage":{"completion_tokens":120,"prompt_tokens":19,"total_tokens":139},"lastOne":true}
data: [DONE]
```

SDK Example

```python
from openai import OpenAI
import base64
import httpx

API_BASE = "https://api.01.ai/v1"
API_KEY = "your-key-here"
client = OpenAI(
  api_key=API_KEY,
  base_url=API_BASE
)
# Approach 1: use an image URL
image = "https://platform.01.ai/assets/sample-table.jpg"
# Approach 2: fetch the image from a URL and encode it to base64
# image_url = "https://platform.01.ai/assets/sample-table.jpg"
# image = "data:image/jpeg;base64," + base64.b64encode(httpx.get(image_url).content).decode("utf-8")
# Approach 3: read a local image and encode it to base64
# image_path = "./yidemo.jpeg"
# with open(image_path, "rb") as image_file:
#     image = "data:image/jpeg;base64," + base64.b64encode(image_file.read()).decode("utf-8")
# Make a request (can be multi-round)
stream = client.chat.completions.create(
  model="yi-vision",
  messages=[
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "Please describe this picture in detail."
        },
        {
          "type": "image_url",
          "image_url": {
            "url": image
          }
        }
      ]
    }
  ],
  stream=True
)
for part in stream:
  print(part.choices[0].delta.content or "", end="", flush=True)
```

Response

The image is a table with a green and white color scheme, listing the top 10 companies in a particular ranking. The table has six columns titled "Rank," "Rank Change," "Company Name," "Value (Billion Dollar)," "Value Change," "City," and "Industry." Each row corresponds to a different company, with the rank numbering from 1 to 10. The first column shows the rank of the company, with the first company being ranked number 1. The second column shows the rank change, with phrases like "Unchanged," "Up 1," "Down 1," "Up 15," etc. The third column lists the company names, such as "Infinity Innovations Inc.," "Space Dynamics Corp.," and "Quantum Solutions Ltd." The fourth column lists the company values in billions of dollars, with the highest value being 138 billion. The fifth column shows the value change, with both positive and negative numbers, and one entry marked with a slash, indicating no change. The sixth column lists the cities where the companies are located, including "Los Angeles," "New York," and "London." The last column lists the industries the companies belong to, such as "FinTech," "Social Media," "Big Data," "Aerospace," and "Logistics." The table has a light background, and the columns are separated by thicker lines, while the rows are separated by thinner lines.

Error Codes

| Error code | Error message | Cause | Solution |
|---|---|---|---|
| 400 | Bad request | The model's inputs + outputs (max_tokens) exceed the model's maximum context window length. | Reduce the input length, or set max_tokens to a smaller value. |
| 400 | Bad request | Input format error. | Check the input format to make sure it is correct. For example, the model name must be all lowercase, yi-large. |
| 401 | Authentication Error | API Key is missing or invalid. | Make sure your API Key is valid. |
| 404 | Not found | Invalid endpoint URL or model ID. | Make sure a valid endpoint URL and model ID are used. |
| 429 | Too Many Requests | You are sending requests too quickly. | Pace your requests. |
| 500 | Internal Server Error | The server had an error while processing your request. | Retry your request after a brief wait and contact us if the issue persists. |
| 529 | System busy | Our servers are experiencing high traffic. | Retry your requests after a brief wait. |
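
The 429, 500, and 529 errors in the table are transient and worth retrying with exponential backoff, while 400, 401, and 404 indicate a problem with the request itself. A minimal sketch follows; `backoff_delay` and `call_with_retry` are our own helper names, and the `status_code` attribute on the raised exception is an assumed shape, not a guaranteed part of any SDK.

```python
import time

RETRYABLE = {429, 500, 529}  # transient statuses from the table above

def backoff_delay(attempt, base=1.0, cap=30.0):
    """Exponential backoff: base, 2*base, 4*base, ... capped at `cap` seconds."""
    return min(cap, base * (2 ** attempt))

def call_with_retry(request_fn, max_retries=3, base=1.0):
    """Call `request_fn`; retry transient HTTP errors with backoff.

    `request_fn` is any zero-argument callable that raises an exception
    carrying a `status_code` attribute on failure (assumed shape).
    """
    for attempt in range(max_retries + 1):
        try:
            return request_fn()
        except Exception as exc:
            status = getattr(exc, "status_code", None)
            if status in RETRYABLE and attempt < max_retries:
                time.sleep(backoff_delay(attempt, base=base))
            else:
                raise  # 400/401/404 and exhausted retries are not retried
```

Wrap your API call in a lambda or small function and pass it to `call_with_retry`, e.g. `call_with_retry(lambda: client.chat.completions.create(...))`.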

Usage Tiers and Rate Limits

Why do we have rate limits?

  1. Prevention of abuse: Rate limiting helps prevent the API from being abused or misused. For example, a malicious actor may attempt to overload the API or cause a service outage by sending a high volume of requests in a short amount of time. Setting a rate limit helps protect platform users from such attacks.
  2. Fair access: Rate limits ensure that the API remains available and responsive to all users. Without these limits, a small number of users could consume too many resources and degrade the experience of everyone else. With rate limiting policies configured appropriately according to users' actual needs, the 01.AI API platform ensures that the majority of users have the best possible experience.
  3. Infrastructure management: Rate limiting helps manage the overall load on the API infrastructure, which is critical to maintaining service reliability and performance. Especially during a sudden surge in demand, controlling how often users send requests lets the API provider better manage its resources and avoid performance bottlenecks or service interruptions.

Usage Tiers and Rate Limits

| Usage Tier | RPM | TPM | Spending limit | Qualification |
|---|---|---|---|---|
| Tier Free | 4 | 32,000 | $0 | - |
| Tier 1 | 10 | 80,000 | $10 | $5 paid |
| Tier 2 | 40 | 120,000 | $100 | $50 paid |
| Tier 3 | 120 | 160,000 | $200 | $100 paid |
| Tier 4 | 120 | 240,000 | $1,000 | $500 paid |
| Tier 5 | 200 | 400,000 | $2,000 | $1,000 paid |
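
To stay within your tier's RPM budget on the client side, one option is a sliding-window limiter that tracks request timestamps over the last 60 seconds. The sketch below is illustrative only; `RpmLimiter` is our own helper name, and the injectable `clock` parameter exists purely to make the logic testable.

```python
import time
from collections import deque

class RpmLimiter:
    """Client-side sliding-window limiter for requests per minute (RPM).

    Tier limits come from the table above, e.g. rpm=4 on the free tier.
    """
    def __init__(self, rpm, window=60.0, clock=time.monotonic):
        self.rpm = rpm
        self.window = window
        self.clock = clock          # injectable time source for testing
        self.sent = deque()         # timestamps of recent requests

    def try_acquire(self):
        """Return True if a request may be sent now, else False."""
        now = self.clock()
        # Drop timestamps that fell out of the 60-second window.
        while self.sent and now - self.sent[0] >= self.window:
            self.sent.popleft()
        if len(self.sent) < self.rpm:
            self.sent.append(now)
            return True
        return False
```

Before each API call, check `try_acquire()` and sleep briefly when it returns False; TPM budgets can be tracked the same way by summing token counts instead of counting requests.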

FAQ

  • Q: How do I use the 01.AI API with LangChain and LlamaIndex?
    A: To use LangChain, import ChatOpenAI from langchain_openai and set proper values for the openai_api_base, openai_api_key, and model fields.

    from langchain_openai import ChatOpenAI

    llm = ChatOpenAI(openai_api_base="https://api.01.ai/v1", openai_api_key="Your-API-key", model="yi-large")
    print(llm.invoke("hi, who are you?"))

    To use LlamaIndex, import OpenAILike from llama_index.llms.openai_like and ChatMessage from llama_index.core.llms, and set proper values for the api_base, api_key, and model fields.

    from llama_index.llms.openai_like import OpenAILike
    from llama_index.core.llms import ChatMessage

    model = OpenAILike(api_base="https://api.01.ai/v1", api_key="Your-API-key", model="yi-large", is_chat_model=True)
    response = model.chat(messages=[ChatMessage(content="Hi, Who are you?")])
    print(response)