An intuitive prompt-chaining tool that enables businesses and prompt engineers to create, test, and deploy prompt chains.
PipeGPT is a design tool that helps businesses and prompt engineers create and test prompt chains, or “pipes”. Integrate external data sources and APIs into your pipes, supercharging the prompt creation process. Enforce and test data requirements with schema validation at each node of the prompt chain.
Why is PipeGPT needed?
“Prompt Engineering” and Large Language Models (LLMs) are new concepts for most people and businesses, and as such there are few tools available.
If you're a business looking to use LLMs for document processing, you might create a prompt to extract the necessary data in the right format for your backend systems. However, when you test that prompt on similar documents, the model may not consistently return answers in the same data type or format, or with the same accuracy. This inconsistency makes it challenging for businesses to use LLMs in automated solutions. What factors affect the consistency of the output?
- Task Complexity: Complicated prompts that involve many steps or tasks can cause LLMs to hallucinate and generate incorrect answers. This is mainly an input problem caused by human error and ambiguity.
- Temperature: Temperature controls the randomness of the output, which is why it is often described as the parameter that affects the "determinism" of a prompt. A temperature of 0 will produce the same output every time for a given input.
- Available Knowledge: LLMs only have access to the knowledge they were trained on, plus the "new-to-model" information you provide in the prompt. If the prompt doesn't provide enough information, the model may hallucinate a response.
- Context/Token Length: The number of tokens (roughly word fragments) an LLM can accept as input and use in the response is defined by its token limit. If you're providing new-to-model information in the prompt, the model may not have enough remaining tokens to generate a consistent response.
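The temperature effect described above can be sketched numerically. The function below is an illustrative stand-in for the softmax-with-temperature step that LLM providers apply server-side to the model's logits; it is not part of PipeGPT.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Scale logits by 1/temperature before normalizing into probabilities.

    Lower temperatures sharpen the distribution; as temperature approaches 0,
    sampling approaches a deterministic argmax over the logits.
    """
    if temperature == 0:
        # Degenerate case: all probability on the single most likely token,
        # which is why temperature 0 yields the same output for a given input.
        probs = [0.0] * len(logits)
        probs[logits.index(max(logits))] = 1.0
        return probs
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]
print(softmax_with_temperature(logits, 1.0))  # softer, more random distribution
print(softmax_with_temperature(logits, 0))    # deterministic: [1.0, 0.0, 0.0]
```

Note how a temperature below 1 concentrates probability on the top token, while 0 removes randomness entirely.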
These factors all contribute to inconsistency in the generated response, and each must be independently controlled by the user to ensure consistent output. These are the problems that PipeGPT aims to solve.
How does PipeGPT address these problems?
PipeGPT addresses these problems and MORE with:
- Structured Prompt Chaining
PipeGPT uses in-prompt schemas that define the request and expected response variables, their data types, and their semantic meaning. This schema structures an LLM call so that it produces a predictable, unambiguous response that can be stored and fed to the next node.
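The pattern can be sketched as follows. The schema format and function names here are hypothetical (PipeGPT's actual schema syntax is not shown above); the idea is that the same schema is rendered into the prompt and then used to validate the model's reply before it enters pipe memory.

```python
import json

# Hypothetical schema: field name -> expected Python type.
SCHEMA = {
    "invoice_number": str,  # e.g. "INV-1042"
    "total": float,         # invoice total as a number
}

def render_schema(schema: dict) -> str:
    """Describe the expected response shape inside the prompt itself."""
    fields = ", ".join(f'"{k}": <{t.__name__}>' for k, t in schema.items())
    return f"Respond ONLY with JSON of the form {{{fields}}}."

def validate_response(raw: str, schema: dict) -> dict:
    """Parse the model's reply and enforce field names and types."""
    data = json.loads(raw)
    for key, expected in schema.items():
        if key not in data:
            raise ValueError(f"missing field: {key}")
        if not isinstance(data[key], expected):
            raise TypeError(f"{key} should be {expected.__name__}")
    return data

prompt = "Extract the invoice fields. " + render_schema(SCHEMA)
reply = '{"invoice_number": "INV-1042", "total": 99.5}'  # simulated LLM output
print(validate_response(reply, SCHEMA))
```

A reply that fails validation is rejected before it can pollute downstream nodes, which is what makes the chain's output predictable.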
- Global Pipe Memory
The variables produced by each node in a pipe are stored within the Global Memory of the pipe. This allows users to access variables created from previous nodes in the pipe, and to select exactly which information they want to supply to the prompt as input.
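A minimal sketch of the global-memory pattern, using hypothetical names rather than PipeGPT's actual API: each node declares which variables it reads from memory and which it writes back for downstream nodes.

```python
class Pipe:
    def __init__(self):
        self.memory = {}  # global pipe memory: variable name -> value
        self.nodes = []

    def add_node(self, fn, inputs, outputs):
        self.nodes.append((fn, inputs, outputs))

    def run(self, **initial):
        self.memory.update(initial)
        for fn, inputs, outputs in self.nodes:
            # Select exactly the variables this node asked for...
            args = {name: self.memory[name] for name in inputs}
            results = fn(**args)
            # ...and store its outputs for any later node to use.
            self.memory.update(zip(outputs, results))
        return self.memory

pipe = Pipe()
pipe.add_node(lambda text: (text.strip().lower(),), ["text"], ["clean_text"])
pipe.add_node(lambda clean_text: (len(clean_text.split()),), ["clean_text"], ["word_count"])
print(pipe.run(text="  Hello Pipe World  "))
```

Because every intermediate variable survives in memory, a late node can reach back to an early node's output without re-running it.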
- External APIs
Supercharge Pipes with external services like Pinecone's Vector DB, OpenAI Whisper, and other ML models from MLaaS providers like HuggingFace. Cache the information within the global pipe memory for use in future nodes.
- Serverless Pipes
PipeGPT is more than a design tool: it also lets you deploy your pipes to your own serverless infrastructure, such as AWS Lambda, Azure Functions, or Google Cloud Functions, or have PipeGPT host them for you. Since each node is an API call, serverless deployment means you only pay for compute while a node is actually running, increasing the cost efficiency of Pipe execution.
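Because each node is just an API call, a node maps naturally onto a serverless handler. Below is an illustrative AWS-Lambda-style sketch; the event shape and the trivial transformation standing in for an LLM call are assumptions, not PipeGPT's actual deployment contract.

```python
import json

def node_handler(event, context=None):
    """Serverless entry point for a single pipe node (Lambda-style signature)."""
    body = json.loads(event["body"])
    text = body["text"]
    # The node's work -- a trivial transformation standing in for an LLM call.
    result = {"clean_text": text.strip().lower()}
    return {"statusCode": 200, "body": json.dumps(result)}

# Local invocation, exactly as a serverless platform would do per request:
response = node_handler({"body": json.dumps({"text": "  Hello  "})})
print(response)
```

The handler is stateless; any variables the node needs come in with the request and its outputs go back into global pipe memory, which is what makes per-node scaling possible.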
- Redis Cache
A Redis cache stores the global pipe memory and caches the results of external API calls. This speeds up both sides of node execution: retrieving information for the request, and storing the responses.
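The caching pattern can be sketched as below. To keep the example runnable without a server, a plain dict stands in for Redis; in real use the `store` would be a Redis client, and the class and key scheme here are hypothetical.

```python
import hashlib
import json

class NodeCache:
    def __init__(self, store=None):
        # A dict stands in for Redis here; a redis client would work the same way.
        self.store = store if store is not None else {}

    def _key(self, node_name, inputs):
        # Deterministic key: same node + same inputs -> same cache entry.
        payload = json.dumps(inputs, sort_keys=True)
        return f"{node_name}:{hashlib.sha256(payload.encode()).hexdigest()}"

    def get_or_call(self, node_name, inputs, call):
        """Return a cached result for identical inputs, else call and cache."""
        key = self._key(node_name, inputs)
        if key in self.store:
            return self.store[key]
        result = call(**inputs)
        self.store[key] = result
        return result

calls = []
def expensive_api(text):
    calls.append(text)  # track how often the external service is really hit
    return {"embedding_id": f"vec-{len(text)}"}

cache = NodeCache()
cache.get_or_call("embed", {"text": "hello"}, expensive_api)
cache.get_or_call("embed", {"text": "hello"}, expensive_api)  # served from cache
print(len(calls))  # the external API was only hit once
```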
- Execution Tracing
Trace the history of each execution of a Pipe, and see the variables that were created at each node. This lets users audit what information was used to generate a Pipe's final output in applications where Pipes automate decision making.
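A sketch of per-execution tracing, again with hypothetical names: after each node runs, a snapshot of its variables is appended to the trace, so a past run can be audited node by node.

```python
from datetime import datetime, timezone

class TracedPipe:
    def __init__(self):
        self.nodes = []
        self.history = []  # one trace (list of steps) per execution

    def add_node(self, name, fn):
        self.nodes.append((name, fn))

    def run(self, data):
        trace = []
        for name, fn in self.nodes:
            data = fn(data)
            trace.append({
                "node": name,
                "at": datetime.now(timezone.utc).isoformat(),
                "variables": dict(data),  # snapshot of memory after this node
            })
        self.history.append(trace)
        return data

pipe = TracedPipe()
pipe.add_node("extract", lambda d: {**d, "total": 99.5})
pipe.add_node("decide", lambda d: {**d, "approved": d["total"] < 100})
pipe.run({"doc": "invoice.pdf"})
for step in pipe.history[0]:
    print(step["node"], "->", step["variables"])
```

For an automated approval decision like the one above, the trace shows exactly which extracted value the decision was based on.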
- In-tool LLM Fine-Tuning
PipeGPT lets users fine-tune LLM nodes inside the tool itself. When users create an LLM node, they can use their validated samples as fine-tuning data to further improve the performance of that node.
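Turning validated samples into fine-tuning data might look like the sketch below. The chat-style JSONL layout is an assumption borrowed from common fine-tuning APIs, not a documented PipeGPT export format.

```python
import json

def to_finetune_jsonl(samples):
    """Convert (prompt, validated_response) pairs into chat-style JSONL records."""
    lines = []
    for prompt, response in samples:
        record = {"messages": [
            {"role": "user", "content": prompt},
            {"role": "assistant", "content": response},
        ]}
        lines.append(json.dumps(record))
    return "\n".join(lines)

samples = [("Extract the total: Total: 99.50", '{"total": 99.5}')]
print(to_finetune_jsonl(samples))
```

Because each sample has already passed schema validation, every assistant turn in the file is a known-good example of the desired output shape.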
- Sample Validation
Iterate rapidly! When users create a Pipe, they can execute the samples they have loaded into the input node to quickly validate Pipe execution and the individual nodes. Upload and validate using as many samples as you like.
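A sketch of the sample-validation loop (the interface is hypothetical): run every loaded sample through the pipe and report which outputs satisfy the expected check, catching failures without aborting the whole batch.

```python
def validate_samples(pipe_fn, samples, check):
    """Run each sample through the pipe; return (sample, passed, output_or_error)."""
    results = []
    for sample in samples:
        try:
            output = pipe_fn(sample)
            results.append((sample, check(output), output))
        except Exception as err:
            # A crashing sample is reported as a failure, not a stopped run.
            results.append((sample, False, err))
    return results

pipe_fn = lambda text: {"word_count": len(text.split())}
samples = ["one two three", "", "hello world"]
report = validate_samples(pipe_fn, samples, check=lambda out: out["word_count"] > 0)
for sample, ok, output in report:
    print("PASS" if ok else "FAIL", repr(sample), output)
```

The empty sample fails the check, which is exactly the kind of edge case that rapid batch validation is meant to surface before deployment.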
- Cleanse Input Data
Don't worry about formatting input data before sending it to a Pipe. Using the input node, define how you would like data to be cleansed before it is passed to the first node. This adds one more layer of confidence to the execution of nodes in a Pipe.
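An input-node cleansing step might look like the sketch below; the specific rules (line-ending normalization, whitespace collapsing, blank-line capping) are illustrative choices, not PipeGPT's built-in defaults.

```python
import re

def cleanse(text: str) -> str:
    """Normalize raw text before it reaches the first node of a pipe."""
    text = text.replace("\r\n", "\n")       # normalize Windows line endings
    text = re.sub(r"[ \t]+", " ", text)     # collapse runs of spaces and tabs
    text = re.sub(r"\n{3,}", "\n\n", text)  # cap consecutive blank lines
    return text.strip()                     # trim leading/trailing whitespace

raw = "  Invoice\t No:   INV-1042 \r\n\r\n\r\n\r\nTotal:  99.50  "
print(repr(cleanse(raw)))
```

Normalizing input up front means downstream nodes can assume one consistent text shape, regardless of where the document came from.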