Project title: Arduchat
Integrating generative AI with embedded systems can turn an ordinary IoT device into an intelligent interface. This article walks through Arduino code that connects to the OpenAI API, specifically the GPT (Generative Pre-trained Transformer) models, to build an interactive system that answers natural-language questions on an OLED screen.
Software Architecture and Libraries Used
This project involves more than writing code to send messages; its design relies on several high-level libraries working together to manage hardware complexity and secure data communication:
- Wire & Adafruit SSD1306/GFX: Serve as the core for driving the highly popular 0.96-inch OLED display module. Communication occurs via the I2C (Inter-Integrated Circuit) protocol, using only two signal lines: SDA and SCL. The Adafruit_GFX library manages the font library and graphics drawing algorithms, while Adafruit_SSD1306 sends register-level control commands to the driver chip to manage all 8,192 pixels (128x64) on the screen.
- WiFi101: Designed for WINC1500 family WiFi modules (such as those used in the Arduino MKR series). It manages the network connection layer, from associating with an Access Point to handling the TCP/IP Stack to create a data pipeline between the board and the outside world.
- ArduinoBearSSL: This is the most crucial part in terms of security engineering. Since the OpenAI API enforces HTTPS (port 443) protocol exclusively, performing an SSL/TLS Handshake on a resource-constrained microcontroller is very challenging. The BearSSL library enables the system to perform Cryptography calculations and manage Certificates to verify that data exchanged with the server is not intercepted or modified.
- ArduinoJson: In the world of APIs, data is exchanged in JSON (JavaScript Object Notation) format. This library allows us to perform Serialization (converting objects in code into JSON text for export) and Deserialization (unpacking received JSON text back into usable variables) quickly and efficiently in terms of memory.
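To make the 8,192-pixel figure concrete, here is a minimal desktop C++ sketch of a 1-bit-per-pixel frame buffer using the page-oriented layout commonly associated with SSD1306 drivers (each byte holds 8 vertically stacked pixels). The names FrameBuffer, byteIndex, bitMask, setPixel and getPixel are illustrative and not taken from the project or from the Adafruit libraries.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

constexpr int kWidth = 128;   // display width in pixels
constexpr int kHeight = 64;   // display height in pixels

// One bit per pixel: 128 * 64 / 8 = 1024 bytes of frame buffer.
using FrameBuffer = std::array<uint8_t, kWidth * kHeight / 8>;

// Byte that holds pixel (x, y): rows are grouped into 8-pixel-tall "pages".
constexpr std::size_t byteIndex(int x, int y) {
    return static_cast<std::size_t>(x + (y / 8) * kWidth);
}

// Bit of pixel (x, y) inside its byte (bit 0 is the top row of the page).
constexpr uint8_t bitMask(int y) {
    return static_cast<uint8_t>(1u << (y & 7));
}

// Turn a single pixel on; out-of-range coordinates are ignored.
inline void setPixel(FrameBuffer& fb, int x, int y) {
    if (x < 0 || x >= kWidth || y < 0 || y >= kHeight) return;
    fb[byteIndex(x, y)] |= bitMask(y);
}

// Read a single pixel; out-of-range coordinates read as "off".
inline bool getPixel(const FrameBuffer& fb, int x, int y) {
    if (x < 0 || x >= kWidth || y < 0 || y >= kHeight) return false;
    return (fb[byteIndex(x, y)] & bitMask(y)) != 0;
}
```

The point of the sketch is the memory cost: the whole screen fits in 1 KB of RAM, which is a noticeable share of the budget on a small microcontroller once TLS and JSON buffers are added.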
System Startup and Network Management (Initialization)
When power is supplied to the system, the program begins by initializing the OLED screen so that it can show real-time status updates to the user. It then proceeds to the WiFi connection step. An interesting and often overlooked next step is retrieving the current time via NTP (Network Time Protocol).
From an engineering perspective, this step is critically important for the operation of BearSSL because verifying the validity of the SSL Certificate from OpenAI requires knowing the "exact date and time" to confirm that the certificate has not expired (Not After) and is already active (Not Before). If the board's internal clock does not match reality, the Handshake will fail, and the board will be unable to connect to the API at all.
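The validity check described above boils down to a range comparison on timestamps. The following is a minimal desktop C++ sketch of that logic, not BearSSL's actual implementation; ValidityWindow, isWithinValidity, clockLooksSynced and the 2020 cutoff are hypothetical names and values chosen for illustration.

```cpp
#include <cstdint>

// Validity window of a certificate, as Unix timestamps in seconds.
struct ValidityWindow {
    uint32_t notBefore;  // certificate is not valid before this instant
    uint32_t notAfter;   // certificate is not valid after this instant
};

// True only when `now` lies inside [notBefore, notAfter].
constexpr bool isWithinValidity(uint32_t now, const ValidityWindow& w) {
    return now >= w.notBefore && now <= w.notAfter;
}

// Assumed sanity threshold: 2020-01-01T00:00:00Z. A board that has not
// synced its clock typically reports a time near the epoch (1970).
constexpr uint32_t kEarliestPlausibleTime = 1577836800u;

// Cheap guard to run before attempting a TLS handshake.
constexpr bool clockLooksSynced(uint32_t now) {
    return now >= kEarliestPlausibleTime;
}
```

On the board itself, sketches built on ArduinoBearSSL usually supply the time through a callback registered with ArduinoBearSSL.onGetTime(), often backed by WiFi.getTime() from WiFi101; whether this project does exactly that is an assumption. Waiting until the clock looks synced before connecting avoids a handshake that is bound to fail.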
Main Loop Mechanism and OpenAI API Calls
In the main Loop, the program is designed to operate as a State Machine to send questions to OpenAI, with the following interesting technical details:
- Creating a Secure Session: The program creates a WiFiSSLClient object, which acts as an encrypted tunnel directly to api.openai.com.
- HTTP POST Request Structure: Calling GPT requires using the POST method, with the following important Headers:
  - Content-Type: application/json to specify the data format.
  - Authorization: Bearer [API_KEY], which is the key to authentication and billing against your account.
- Payload Management with ArduinoJson: The code prepares the data structure in memory before sending it out.
// Static memory management to prevent Heap Fragmentation
StaticJsonDocument<512> doc;
doc["model"] = "gpt-3.5-turbo"; // chat model; the "messages" array below is the chat format (legacy completion models such as text-davinci-003 take a "prompt" field instead)
doc["messages"][0]["role"] = "user";
doc["messages"][0]["content"] = "What is Arduino?";
doc["max_tokens"] = 50; // Limit Token count to save Bandwidth and cost
// Send serialized data directly through the Client
serializeJson(doc, client);
StaticJsonDocument is chosen over DynamicJsonDocument because RAM on a microcontroller is limited: a fixed-size document allocated on the stack avoids the heap fragmentation that repeated dynamic allocation can cause when the program runs for extended periods.
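The request head that precedes the JSON payload can be sketched as a plain string builder. This is a desktop C++ illustration, not the project's code: buildRequestHead is a hypothetical helper, and the path /v1/chat/completions, the Content-Length header and the Connection: close header are assumptions that do not appear in the original description.

```cpp
#include <cstddef>
#include <string>

// Builds the head of an HTTP/1.1 POST request carrying a JSON body.
// host, path, apiKey and bodyLength are all supplied by the caller.
std::string buildRequestHead(const std::string& host,
                             const std::string& path,
                             const std::string& apiKey,
                             std::size_t bodyLength) {
    std::string head;
    head += "POST " + path + " HTTP/1.1\r\n";
    head += "Host: " + host + "\r\n";
    head += "Content-Type: application/json\r\n";
    head += "Authorization: Bearer " + apiKey + "\r\n";
    // The server needs to know where the body ends.
    head += "Content-Length: " + std::to_string(bodyLength) + "\r\n";
    head += "Connection: close\r\n";
    head += "\r\n";  // blank line separates the headers from the body
    return head;
}
```

On the Arduino side, the body length would most likely come from ArduinoJson's measureJson(doc), called before serializeJson(doc, client), so the payload can be streamed without first being copied into a separate buffer.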
Response Parsing
After the OpenAI server finishes processing, it sends back a large block of JSON text. The system's task is to "skip" the HTTP Header section and proceed directly to Parse the JSON content.
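Skipping the header section amounts to finding the first blank line in the response. Below is a minimal desktop C++ sketch of that step, assuming the whole response is already in memory as one string (on the board it would more likely be consumed from the client stream); parseStatusCode and extractBody are hypothetical helper names.

```cpp
#include <cstddef>
#include <string>

// Returns the status code from a line such as "HTTP/1.1 200 OK",
// or -1 if the response does not start with a well-formed status line.
int parseStatusCode(const std::string& response) {
    const std::size_t firstSpace = response.find(' ');
    if (firstSpace == std::string::npos || firstSpace + 4 > response.size()) {
        return -1;
    }
    int code = 0;
    for (std::size_t i = firstSpace + 1; i < firstSpace + 4; ++i) {
        const char c = response[i];
        if (c < '0' || c > '9') return -1;  // status code must be 3 digits
        code = code * 10 + (c - '0');
    }
    return code;
}

// Returns everything after the first blank line (CRLF CRLF),
// or an empty string if no header/body separator is present.
std::string extractBody(const std::string& response) {
    const std::size_t sep = response.find("\r\n\r\n");
    if (sep == std::string::npos) return std::string();
    return response.substr(sep + 4);
}
```

Checking the status code before parsing is a cheap safeguard: an error response (for example an authentication failure) also carries JSON, but not the choices array the display code expects.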
ArduinoJson then walks the nested objects, for example doc["choices"][0]["message"]["content"], to extract only the answer text. Once the text is available as a String, the program calls display.clearDisplay() to clear the old screen and writes the new text with display.println() (with Adafruit_SSD1306, the buffer only reaches the panel once display.display() is called). Adafruit_GFX wraps text automatically at the right edge so that it stays within the screen's 128-pixel width; note that this wrap happens at the character level, so a long word can be split across two lines.
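If breaking lines at spaces matters, the text can be pre-wrapped before printing, since as far as I can tell the built-in GFX wrapping breaks at whichever character overflows the line rather than at word boundaries. The following desktop C++ sketch shows one way to do that; wrapWords is a hypothetical helper, and the 6-pixel character cell is an assumption that holds for the default GFX font at text size 1.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

constexpr int kScreenWidth = 128;  // OLED width in pixels
constexpr int kCharWidth = 6;      // assumed cell width of the default font, size 1
constexpr int kColumns = kScreenWidth / kCharWidth;  // characters per line

// Greedy word wrap: packs whole words into lines of at most maxCols
// characters; words longer than a full line are hard-split.
std::vector<std::string> wrapWords(const std::string& text, std::size_t maxCols) {
    std::vector<std::string> lines;
    std::istringstream words(text);
    std::string word, line;
    while (words >> word) {
        while (word.size() > maxCols) {  // oversized word: split it
            if (!line.empty()) { lines.push_back(line); line.clear(); }
            lines.push_back(word.substr(0, maxCols));
            word.erase(0, maxCols);
        }
        if (line.empty()) {
            line = word;
        } else if (line.size() + 1 + word.size() <= maxCols) {
            line += ' ';
            line += word;
        } else {
            lines.push_back(line);  // current line is full, start a new one
            line = word;
        }
    }
    if (!line.empty()) lines.push_back(line);
    return lines;
}
```

Each resulting line can then be passed to display.println() in turn. With an 8-pixel line height, a 64-pixel-tall screen shows 8 such lines, which is one more reason to keep max_tokens small.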
Engineering Summary
This project is a clear example of Edge-to-Cloud Integration, demonstrating that even devices with low processing power (Low-power MCU) can access the capabilities of Large Language Models (LLM). The true challenge is not merely writing functional code, but managing the Memory Footprint and ensuring data security via SSL. The choice of well-optimized libraries like BearSSL and ArduinoJson is therefore crucial for making the system stable and Production Ready.