Skip to main content

PL architecture

The PL engine and SQL engine can interact with each other. SQL statements can directly access the PL engine, such as using user-defined functions. The PL engine can access the SQL engine through the SPI interface to perform expression calculations and execute SQL statements.

The interaction between the PL engine and SQL engine is shown in the following figure:

PL1

The PL engine consists of six modules: Parser, Resolver, Code Generator, Compiler, Executor, and PL Cache. The Parser, Resolver, Code Generator, and Compiler modules form a complete PL compilation process. The following figure shows the modules:

PL2

  • Parser (syntax parser)

    The parser analyzes the syntax of PL and generates a parse tree. The PL engine and SQL engine each implement their own parser, but the two parsers avoid redundant work. In seekdb, a query string first enters the PL parser for parsing. If the query string is an SQL statement, it is then parsed by the SQL engine.

  • Resolver (semantic parser)

    The resolver performs semantic analysis, such as checking variable scope and data object schema in static SQL, and generates an abstract syntax tree (AST) for each PL statement and a global AST. The AST stores basic information about PL definitions, as well as global symbol tables, global label tables, and global exception tables. The AST of each statement records logical information pointing to these global tables.

  • Code Generator (code generator)

    The code generator uses the LLVM interface to further translate the AST into intermediate representation (IR) code. The IR code can be used to verify the correctness of the translation process.

  • Compiler (compiler)

    The compiler uses just-in-time (JIT) compilation to generate machine code from the IR code and outputs the result as an executable PL object.

  • Executor (executor)

    The executor constructs an execution environment based on the compiled PL executable object and input parameters, and calls the function pointer to obtain the function result.

  • PL Cache (execution plan cache module)

    The PL cache is an internal mechanism of the PL engine. The PL engine provides a unified interface for executing procedures or functions by ID. External applications do not need to understand the PL cache mechanism; they only need to execute PL through the PL engine. The PL cache is a hash table that maps keys (procedure or function IDs) to values (PL execution objects). The schema is accessed to check the validity of the PL execution object. If the object is invalid, it is immediately deleted. The PL cache avoids recompiling PL every time, thereby improving the execution efficiency of PL. Therefore, there is no need to use the cache for anonymous blocks.

The PL engine provides an Execute interface, which includes parameters such as PL ID, execution parameters params, execution environment Context, and execution result Result (only valid for functions). First, the PL engine checks the PL cache for a compiled PL executable object based on the ID. If a compiled object is found, it further checks the version. If no valid PL executable object is found in the cache, the compilation process is called to compile the PL. The compiled result is cached in the PL cache and then passed to the executor for execution. The compilation result is a memory address of binary code. The executor converts this memory address into a function pointer and runs it with the execution parameters and environment to obtain the result.