lexical analysis
In computer science, lexical analysis, lexing, or tokenization is the process of converting a sequence of characters (such as in a computer program or web page) into a sequence of tokens (strings with an assigned and thus identified meaning). A program that performs lexical analysis may be termed a lexer, tokenizer ("Anatomy of a Compiler and The Tokenizer", www.cs.man.ac.uk), or scanner, though scanner is also a term for the first stage of a lexer. A lexer is generally combined with a parser, which together analyze the syntax of programming languages, web pages, and so forth.
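
As a quick illustration, the minimal sketch below uses Python's standard-library tokenize module to turn a line of source text into named tokens; the input string is an arbitrary example, not taken from the article.

    import io
    import tokenize

    source = "result = price * 1.07"
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        print(tokenize.tok_name[tok.type], repr(tok.string))
    # prints NAME 'result', OP '=', NAME 'price', OP '*', NUMBER '1.07',
    # followed by NEWLINE and ENDMARKER tokens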

Applications

A lexer forms the first phase of a compiler frontend in modern processing. Analysis generally occurs in one pass. In older languages such as ALGOL, the initial stage was instead line reconstruction, which performed unstropping and removed whitespace and comments (such languages had scannerless parsers, with no separate lexer); these steps are now done as part of the lexer.

Lexers and parsers are most often used for compilers, but can be used for other computer language tools, such as prettyprinters or linters. Lexing can be divided into two stages: scanning, which segments the input string into syntactic units called lexemes and categorizes these into token classes; and evaluating, which converts lexemes into processed values.

Lexers are generally quite simple, with most of the complexity deferred to the parser or semantic analysis phases, and can often be generated by a lexer generator, notably lex or derivatives. However, lexers can sometimes include some complexity, such as phrase structure processing to make input easier and simplify the parser, and may be written partly or fully by hand, either to support more features or for performance.
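
A hedged sketch of these two stages is shown below; the token classes, regular expressions, and helper names are illustrative assumptions for this sketch, not definitions from the article. The scanner segments the input into lexemes via regular expressions, and the evaluator converts each lexeme into a processed value.

    import re

    # Illustrative token classes; names and patterns are assumptions for this sketch.
    TOKEN_SPEC = [
        ("NUMBER",     r"\d+(?:\.\d+)?"),
        ("IDENTIFIER", r"[A-Za-z_]\w*"),
        ("OPERATOR",   r"[+\-*/=<>]"),
        ("SEPARATOR",  r"[(){};,]"),
        ("SKIP",       r"\s+"),
    ]
    MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

    def lex(text):
        """Scanning: segment the input into lexemes and categorize them.
        Evaluating: convert each lexeme into a processed value."""
        for match in MASTER.finditer(text):
            name, lexeme = match.lastgroup, match.group()
            if name == "SKIP":                      # whitespace yields no token
                continue
            value = float(lexeme) if name == "NUMBER" else lexeme
            yield name, value

    print(list(lex("net = assets - 2.5")))
    # [('IDENTIFIER', 'net'), ('OPERATOR', '='), ('IDENTIFIER', 'assets'),
    #  ('OPERATOR', '-'), ('NUMBER', 2.5)]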

Lexeme

A lexeme is a sequence of characters in the source program that matches the pattern for a token and is identified by the lexical analyzer as an instance of that token (page 111, Compilers: Principles, Techniques, & Tools, 2nd ed., by Aho, Lam, Sethi, and Ullman). Some authors term this a "token", using "token" interchangeably for both the string being tokenized and the token data structure that results from putting this string through the tokenization process (perlinterp, Perl 5 version 24.0 documentation, perldoc.perl.org, 26 January 2017; "What is the difference between token and lexeme?", Guy Coder, Stack Overflow, 19 February 2013).

The word lexeme in computer science is defined differently than lexeme in linguistics. A lexeme in computer science roughly corresponds to what might be termed a word in linguistics (the term word in computer science has a different meaning than word in linguistics), although in some cases it may be more similar to a morpheme.
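
To make the pattern/lexeme distinction concrete, the snippet below is an illustrative sketch (the identifier pattern and the sample statement are assumptions, not taken from the text): lexemes are the strings that match a token's pattern.

    import re

    # Pattern for a hypothetical IDENTIFIER token: a letter or underscore
    # followed by any number of letters, digits, or underscores.
    identifier_pattern = re.compile(r"[A-Za-z_]\w*")

    source = "net_worth_future = assets - liabilities;"
    print(identifier_pattern.findall(source))
    # ['net_worth_future', 'assets', 'liabilities']
    # Each matched string is a lexeme, reported by the lexer as an
    # instance of the IDENTIFIER token.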

Token

A lexical token or simply token is a string with an assigned and thus identified meaning. It is structured as a pair consisting of a token name and an optional token value. The token name is a category of lexical unit. Common token names are
  • identifier: names the programmer chooses;
  • keyword: names already in the programming language;
  • separator (also known as a punctuator): punctuation characters and paired delimiters;
  • operator: symbols that operate on arguments and produce results;
  • literal: numeric, logical, textual, reference literals;
  • comment: line, block.
{|class="wikitable"|+ Examples of token values! Token name !! Sample token values
x}}, {{codeUP}}
2=c2=c2=c|return}}
| }, (, ;
2=c2=c|1=
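
As a minimal sketch of the name/value pair described above (the Token class and the sample statement are assumptions for illustration, not part of the article), a token stream for a short statement might be represented like this:

    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class Token:
        name: str                    # token class, e.g. "identifier" or "operator"
        value: Optional[str] = None  # the lexeme or its processed value, if any

    # The statement  x = a + b * 2;  might tokenize to:
    tokens = [
        Token("identifier", "x"),
        Token("operator", "="),
        Token("identifier", "a"),
        Token("operator", "+"),
        Token("identifier", "b"),
        Token("operator", "*"),
        Token("literal", "2"),
        Token("separator", ";"),
    ]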

