X-Git-Url: https://git.xonotic.org/?p=xonotic%2Fgmqcc.git;a=blobdiff_plain;f=README;h=052082b8d44c64504f387e898055c5db7b0b7b88;hp=d069def2cf042f11d2df6f8047859712a1a89095;hb=4a1f67bb973d3e4ae8726eed7d1f7df51e4b05b3;hpb=48a95ec3c9a4a8601f97c8ac1adbd7d94ba15465

diff --git a/README b/README
index d069def..052082b 100644
--- a/README
+++ b/README
@@ -1,41 +1,239 @@
-This is my work in progress C compiler. There are very few _good_ qc
+This is a work in progress Quake C compiler. There are very few good QC
 compilers out there on the internet that can be used in the opensource
 community. There are a lot of mediocre compilers, but no one wants those.
-This is the solution for that, for once a proper quake c compiler that is
-capable of doing proper optimization. The design so far of this compiler
-is basic, because it doesn't actually compile code yet.
-
-gmqcc.h
-    This is the common header with all definitions, structures, and
-    constants for everything.
-
-error.c
-    This is the error subsystem, this handles the output of good detailed
-    error messages (not currently, but will), with colors and such.
-
-lex.c
-    This is the lexer, a very small basic step-seek lexer that can be easily
-    changed to add new tokens, very retargetable.
-
-main.c
-    This is the core compiler entry, handles switches (will) to toggle on
-    and off certian compiler features.
-
-parse.c
-    This is the parser which goes over all tokens and generates a parse tree
-    (not currently, but will) and check for syntax correctness.
-
-README
-    This is the file you're currently reading
-
-Makefile
-    The makefile, when sources are added you should add them to the SRC=
-    line otherwise the build will not pick it up. Trivial stuff, small
-    easy to manage makefile, no need to complicate it.
-    Some targets:
-        #make gmqcc
-            Builds gmqcc, creating a gmqcc binary file in the current
-            directory as the makefile.
+This is the solution for that: for once, a proper Quake C compiler that is
+capable of doing proper optimization.
+
+The compiler is intended to implement modern compiler design principles
+and to support modification through extensions, which are provided to the
+user through a low-level, syntax-specific language inside the language itself
+to implement language functionality.
+
+The design goals of the compiler are ambitious: it is intended to support
+a multitude of features. These features, along with their state of
+completion, are listed in the table below.
+
++-------------------+-----------------------------+------------------+
+| Feature           | What's it for?              | Completion       |
++-------------------+-----------------------------+------------------+
+. Lexical analysis  . Tokenization                . 90%              .
+.-------------------.-----------------------------.------------------.
+. Tokenization      . Parsing                     . 90%              .
+.-------------------.-----------------------------.------------------.
+. Parsing / SYA     . AST Generation              . 09%              .
+.-------------------.-----------------------------.------------------.
+. AST Generation    . IR Generation               . ??%              .
+.-------------------.-----------------------------.------------------.
+. IR Generation     . Code Generation             . ??%              .
+.-------------------.-----------------------------.------------------.
+. Code Generation   . Binary Generation           . ??%              .
+.-------------------.-----------------------------.------------------.
+. Binary Generation . Binary                      . 100%             .
++-------------------+-----------------------------+------------------+
+
+Design tree:
+    The compiler is intended to work in the following order:
+        Lexical analysis ->
+            Tokenization ->
+                Parsing:
+                    Operator precedence:
+                        Shunting-yard algorithm
+                    Inline assembly:
+                        Usage of the assembler subsystem:
+                            top-down parsing and assembly, no optimization
+                    Other parsing:
+                        recursive descent
+                ->
+                    Abstract syntax tree generation ->
+                        Intermediate representation (SSA):
+                            Optimizations:
+                                Constant propagation
+                                Value range propagation
+                                Sparse conditional constant propagation (possibly?)
+                                Dead code elimination
+                                Constant folding
+                                Global value numbering
+                                Partial redundancy elimination
+                                Strength reduction
+                                Common subexpression elimination
+                                Peephole optimizations
+                                Loop-invariant code motion
+                                Inline expansion
+                                Constant folding
+                                Induction variable recognition and elimination
+                                Dead store elimination
+                                Jump threading
+                        ->
+                            Code Generation:
+                                Optimizations:
+                                    Rematerialization
+                                    Code Factoring
+                                    Recursion Elimination
+                                    Loop unrolling
+                                    Deforestation
+                            ->
+                                Binary Generation
+
+File tree and explanation:
+    gmqcc.h
+        This is the common header with all definitions, structures, and
+        constants for everything.
+
+    error.c
+        This is the error subsystem. It will handle the output of detailed
+        error messages, with colors and such (not yet implemented).
+
+    lex.c
+        This is the lexer: a very small, basic step-seek lexer that can be easily
+        changed to add new tokens; very retargetable.
+
+    main.c
+        This is the core compiler entry point. It will handle switches that
+        toggle certain compiler features on and off.
+
+    parse.c
+        This is the parser, which goes over all tokens, generates a parse tree,
+        and checks for syntax correctness.
+
+    typedef.c
+        This is the typedef system. It is a separate file because it's a lot more
+        complicated than it sounds. It handles all typedefs, even recursive
+        typedefs.
+
+    util.c
+        These are utilities for the compiler. Some things in here include an
+        allocator used for debugging, and some string functions.
+
+    assembler.c
+        This implements support for assembling Quake assembler (which did not
+        actually exist until now; documentation of the Quake assembler is below).
+        This will also implement inline assembly for the C compiler.
+
+    README
+        This is the file you're currently reading.
+
+    Makefile
+        The makefile. When sources are added, you should add them to the SRC=
+        line, otherwise the build will not pick them up. Trivial stuff: a small,
+        easy to manage makefile, no need to complicate it.
+        Some targets:
+            #make gmqcc
+                Builds gmqcc, creating a `gmqcc` binary file in the same
+                directory as the makefile.
+            #make test
+                Builds the ir and ast tests, creating `test_ir` and `test_ast`
+                binary files in the same directory as the makefile.
+            #make test_ir
+                Builds the ir test, creating a `test_ir` binary file in the
+                same directory as the makefile.
+            #make test_ast
+                Builds the ast test, creating a `test_ast` binary file in the
+                same directory as the makefile.
+            #make clean
+                Cleans the build files left behind by a previous build, as
+                well as all the binary files.
+            #make all
+                Builds the tests and the compiler binary, all in the same
+                directory as the makefile.
+
+////////////////////////////////////////////////////////////////////////
+///////////////////// Quake Assembler Documentation ////////////////////
+////////////////////////////////////////////////////////////////////////
+Quake assembler is quite simple: it's just an annotated version of the binary
+produced by any existing QuakeC compiler, but made cleaner to use (so that
+the locations of various globals or strings are not required to be known).
+
+Constants:
+    To declare a constant, use one of the following valid constant typenames:
+    {FLOAT,VECTOR,STRING,FUNCTION,FIELD,ENTITY}. Every typename is followed
+    by a colon and the name (white space doesn't matter).
+
+    Examples:
+        FLOAT: foo 1
+        VECTOR: bar 1 2 1
+        STRING: hello "hello world"
+
+Comments:
+    Comments begin with either # or ; and the assembler ignores the whole
+    line. A comment must be on a line of its own; it cannot follow assembly
+    on the same line.
+
+    Examples:
+        ; this is allowed
+        # as is this
+        FLOAT: foo 1 ; this is not allowed
+        FLOAT: bar 2 # neither is this
+
+Functions:
+    Creating functions is the same as declaring a constant: simply use
+    FUNCTION followed by a colon and the name (white space doesn't matter),
+    and start the statements for that function on the line after it.
+
+    Examples:
+        FLOAT: foo 1
+        FLOAT: bar 2
+        FUNCTION: test1
+            ADD foo, bar, OFS_RETURN
+            RETURN
+
+        FUNCTION: test2
+            CALL0 test1
+            DONE
 
-        #make clean
-            Cleans the build files left behind by a previous build
+Internal:
+    The Quake engine provides some internal functions such as print. To
+    access these, you must first declare them and their names. To do this,
+    declare a FUNCTION as usual, then add a $ followed by the (negated)
+    number of the engine builtin.
+
+    Examples:
+        FUNCTION: print $4
+        FUNCTION: error $3
+
+Misc:
+    There are some rules as to what your identifiers can be for functions
+    and constants. Identifiers must not begin with a numeric digit;
+    they cannot include spaces or tabs; they cannot contain symbols;
+    and they cannot exceed 32768 characters. Identifiers cannot be all
+    capitalized either, as all-capitalized identifiers are reserved by the
+    assembler.
+
+    Numeric constants cannot use exponent notation such as `1e-10`; they
+    must be plain numbers, although they may contain decimal points and
+    signs (+, -).
+
+    Constants cannot be assigned the values of other constants; their value
+    must be fully expressed at the point of declaration.
+
+    No two identifiers can have the same name. This also applies to variables
+    allocated inside a function scope (despite them being considered local).
+
+    There exists one other keyword that is considered sugar, and that
+    is AUTHOR. This keyword allows you to specify the AUTHOR(S) of
+    the assembly being assembled. The string given for each usage
+    of AUTHOR is written to the end of the string table. Similar to the
+    usage of constants and functions, the AUTHOR keyword must be followed
+    by a colon.
+
+    Examples:
+        AUTHOR: "Dale Weiler"
+        AUTHOR: "Wolfgang Bumiller"
+
+    Colons exist for the sole reason of not having to use spaces after
+    a keyword (however, spaces are allowed). To illustrate, the
+    following examples are equivalent in form.
+
+    Example 1:
+        FLOAT:foo 1
+    Example 2:
+        FLOAT: foo 1
+    Example 3:
+        FLOAT:   foo 2
+
+    Variable amounts of whitespace are allowed anywhere (as it should be).
+    Think of `:` as a delimiter (which is what it's used for during assembly).
+
+////////////////////////////////////////////////////////////////////////
+/////////////////////// Quake C Documentation //////////////////////////
+////////////////////////////////////////////////////////////////////////
+TODO ....
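
Reviewer sketch (not part of the patch): the identifier rules and the colon delimiter described in the Quake assembler documentation above could be checked roughly as follows. This is a minimal sketch, not gmqcc's assembler.c. The names qasm_ident_ok and qasm_split_decl are hypothetical. Two readings of the README are assumptions: '_' is counted as a symbol, and "all capitalized" is taken to mean "contains no lowercase letter".

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define QASM_MAX_IDENT 32768 /* length limit stated in the Misc section */

/* Hypothetical helper: returns 1 if `name` passes the identifier rules. */
int qasm_ident_ok(const char *name) {
    size_t len = strlen(name);
    size_t i;
    int has_lower = 0;

    if (len == 0 || len > QASM_MAX_IDENT)
        return 0;
    if (isdigit((unsigned char)name[0]))
        return 0; /* must not begin with a numeric digit */

    for (i = 0; i < len; i++) {
        unsigned char c = (unsigned char)name[i];
        if (!isalnum(c))
            return 0; /* spaces, tabs and symbols ('_' too: assumption) */
        if (islower(c))
            has_lower = 1;
    }
    /* Assumption: no lowercase letter at all means "all capitalized". */
    return has_lower;
}

/* Skips spaces and tabs. */
static const char *skip_ws(const char *p) {
    while (*p == ' ' || *p == '\t')
        p++;
    return p;
}

/* Copies one word (up to ':' or whitespace) into out.
 * Returns 0 if the word is empty or does not fit in cap bytes. */
static int take_word(const char **p, char *out, size_t cap) {
    size_t n = 0;
    if (cap == 0)
        return 0;
    while (**p && **p != ':' && **p != ' ' && **p != '\t') {
        if (n + 1 >= cap) {
            out[n] = '\0';
            return 0;
        }
        out[n++] = *(*p)++;
    }
    out[n] = '\0';
    return n > 0;
}

/* Hypothetical helper: splits "KEYWORD : name ..." at the colon.
 * The line is assumed to have its trailing newline already removed.
 * Returns 1 and fills kw/name on success, 0 otherwise. */
int qasm_split_decl(const char *line, char *kw, size_t kwcap,
                    char *name, size_t namecap) {
    const char *p = skip_ws(line);
    if (!take_word(&p, kw, kwcap))
        return 0;
    p = skip_ws(p);
    if (*p != ':')
        return 0; /* the colon is the required delimiter */
    p = skip_ws(p + 1);
    return take_word(&p, name, namecap);
}
```

With this sketch, `FLOAT:foo 1` and `FLOAT:   foo 2` both split into the keyword `FLOAT` and the name `foo`, while `FLOAT foo 1` is rejected for lacking the colon. AUTHOR lines take a quoted string rather than an identifier, so they would need separate handling.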