Commit Graph

16 Commits

Author SHA1 Message Date
Nikolai Kosjar
70122b3061 C++: Support for UTF-8 in the lexer
This will save us the toLatin1() conversions in CppTools (which already
holds UTF-8 encoded QByteArrays) and the resulting loss of information
(see QTCREATORBUG-7356). It also gives us support for non-Latin-1
identifiers.

API-wise, the following functions are added to Token. They will come in
handy in follow-up patches in combination with QStrings.
    utf16chars() - equivalent of bytes()
    utf16charsBegin() - equivalent of bytesBegin()
    utf16charsEnd() - equivalent of bytesEnd()
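
A minimal sketch of why byte counts and UTF-16 counts differ for the same
token text. The helper utf16LengthOfUtf8() is hypothetical and not part of
the Token API; it assumes its input is valid UTF-8.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not from the patch): counts the UTF-16 code units
// needed to represent a valid UTF-8 byte sequence.
std::size_t utf16LengthOfUtf8(const std::string &utf8)
{
    std::size_t units = 0;
    for (unsigned char byte : utf8) {
        if ((byte & 0xC0) == 0x80)
            continue; // continuation byte, already counted with its lead byte
        // Lead bytes >= 0xF0 start 4-byte sequences, which encode code points
        // outside the BMP and therefore need a surrogate pair in UTF-16.
        units += (byte >= 0xF0) ? 2 : 1;
    }
    return units;
}
```

For an ASCII-only token both counts agree. They diverge once the token
contains multi-byte code points, which is the case the new accessors are
meant to cover.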

Next steps:
 * Adapt functions from TranslationUnit. They should work with utf16
   chars in order to calculate lines and columns correctly also for
   UTF-8 multi-byte code points.
 * Adapt the higher level clients:
    * Cpp{Tools,Editor} should expect UTF-8 encoded Literals.
    * Cpp{Tools,Editor}: When dealing with identifiers on the
      QString/QTextDocument layer, code points
      represented by two QChars need to be respected, too.
 * Ensure Macro::offsets() and Document::MacroUse::{begin,end}() report
   offsets usable in CppEditor/CppTools.

Addresses QTCREATORBUG-7356.

Change-Id: I0791b5236be8215d24fb8e38a1f7cb0d279454c0
Reviewed-by: Erik Verbruggen <erik.verbruggen@digia.com>
2014-05-23 14:23:15 +02:00
Nikolai Kosjar
a9c15c0bf5 C++: Remove Lexer::{tokenOffset(),tokenLength()}
The necessary data can be retrieved by the resulting Token.

Change-Id: I79afb23183c156240c690beff30bb11dfe943e61
Reviewed-by: Erik Verbruggen <erik.verbruggen@digia.com>
2014-05-15 14:48:03 +02:00
Nikolai Kosjar
6f63f6b647 C++: Remove unused functions in Lexer
Change-Id: I78b70eead1c64b9925272c50cc6109c5b415574d
Reviewed-by: Erik Verbruggen <erik.verbruggen@digia.com>
2014-05-15 14:47:39 +02:00
Orgad Shaneh
e600424648 C++: Fix support for incremental input with \n
Also fix false positive line continuation on blank line

e.g.
"foo \

bar"

Change-Id: Ic6d345a4b578c955411d119b8438c8dc5065c072
Reviewed-by: Erik Verbruggen <erik.verbruggen@digia.com>
2014-02-04 11:33:54 +01:00
Orgad Shaneh
0f4e3c356a C++: Support multiline strings and comments
Task-number: QTCREATORBUG-662
Change-Id: I0997fe2afaba71998d5da549b7141df0c023ff12
Reviewed-by: Erik Verbruggen <erik.verbruggen@digia.com>
2014-01-21 10:54:56 +01:00
Orgad Shaneh
a309b3cfe6 C++: Store token kind as lexer state
... when needed

Change-Id: I32a1649c87e1fa42da80eff5003b2f5714062064
Reviewed-by: Erik Verbruggen <erik.verbruggen@digia.com>
2014-01-20 14:11:18 +01:00
Orgad Shaneh
2216c399e5 C++: Remove unused functions in Lexer
Change-Id: I79285a9fc72f26bdfb7c1600d4e7680e02062593
Reviewed-by: Erik Verbruggen <erik.verbruggen@digia.com>
2014-01-20 12:29:15 +01:00
hjk
2b532c73ee CPlusPlus: Make (sub-)language selection more generic
Change-Id: I4e2df6992b446adec662ab07671acd41715e41fd
Reviewed-by: Nikolai Kosjar <nikolai.kosjar@digia.com>
2013-10-15 16:22:28 +02:00
Nikolai Kosjar
b1bb093d15 C++: Fix Qt dependency (Q_UNLIKELY) in 3rdparty/cplusplus
Change-Id: I37ffb657c9e042cc1c186895efd9c58fe6e332fd
Reviewed-by: Przemyslaw Gorszkowski <pgorszkowski@gmail.com>
Reviewed-by: Erik Verbruggen <erik.verbruggen@digia.com>
2013-04-18 11:34:32 +02:00
hjk
ad0331a2a9 C++: Inline Lexer::control()
Change-Id: Ia37ec33fb031fdea4ad1890fcea3a80b7b46e272
Reviewed-by: Nikolai Kosjar <nikolai.kosjar@digia.com>
2013-04-18 10:49:51 +02:00
hjk
a1c7c47cc0 C++: Simplify Lexer::yyinp()
... by assuming we operate on NUL-terminated data, which is
(in theory) guaranteed by (non-raw) QByteArray which we have.
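
A rough sketch of the idea, assuming a NUL-terminated buffer. The Cursor
type and countBytesUntilNul() are hypothetical illustrations, not the
actual Lexer code.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical miniature of a lexer cursor; not the actual Lexer code.
struct Cursor {
    const char *current; // points at the byte currently held in ch
    char ch;

    explicit Cursor(const char *nulTerminated)
        : current(nulTerminated), ch(*nulTerminated) {}

    // Advances without comparing against an end pointer. The caller must
    // stop once ch == '\0', which a NUL-terminated buffer guarantees to occur.
    void advance() { ch = *++current; }
};

// Walks the buffer using the terminating NUL as a sentinel.
std::size_t countBytesUntilNul(const char *nulTerminated)
{
    Cursor cursor(nulTerminated);
    std::size_t count = 0;
    while (cursor.ch != '\0') {
        ++count;
        cursor.advance();
    }
    return count;
}
```

The trade-off in this sketch is that a NUL byte embedded in the input
would end the scan early.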

Change-Id: I855d01ea0dee5328ec737fbabee1086d7a28aa5a
Reviewed-by: Erik Verbruggen <erik.verbruggen@digia.com>
2013-04-16 11:26:30 +02:00
Leandro Melo
e148d030f5 C++: Introduce C++11 raw string literals
Although they are now supported by the lexer
and parser, note that we still need to address
an issue concerning the highlighting of multiline
literals (which will become more common with the
advent of the new raw strings).
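
For illustration, a small example of a multiline C++11 raw string literal
of the kind mentioned above. The variable name rawText is made up for
this sketch.

```cpp
#include <cassert>
#include <cstring>

// Inside R"( ... )" backslashes are not escape characters and a line break
// in the source becomes a newline character in the string.
const char rawText[] = R"(first line\n
second line)";
```

Here the characters \n stay a literal backslash followed by 'n', while the
actual line break is part of the string, so the literal spans two source
lines.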

Task-number: QTCREATORBUG-6722
Change-Id: I137337a9ac0152a1f8b9faded0b960c6fe3dd38a
Reviewed-by: Roberto Raggi <roberto.raggi@nokia.com>
2012-08-23 14:35:02 +02:00
Leandro Melo
b9d15f1296 C++: Avoid looking ahead when lexing u8"literal"
This makes things slightly more efficient. But it will be more
significant when we introduce R"rawliterals" since we would avoid
an even further lookahead for cases like u8R"string".

Change-Id: Id4bad8b917752d23daf2f4989330434979cf602f
Reviewed-by: Roberto Raggi <roberto.raggi@nokia.com>
Reviewed-by: hjk <qthjk@ovi.com>
2012-08-17 15:48:02 +02:00
Leandro Melo
23c637c4f6 C++: Introduce unicode char/strings support
Those are the types char16_t and char32_t along with the new
char/string literals u'', U'', u"", u8"", and U"".
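
As a brief illustration, the literal forms listed above look like this in
standard C++11. The variable names are made up for this sketch.

```cpp
#include <cassert>
#include <cstddef>

const char16_t utf16Char = u'a';       // char16_t character literal
const char32_t utf32Char = U'a';       // char32_t character literal
const char16_t utf16Text[] = u"abc";   // array of char16_t
const char32_t utf32Text[] = U"abc";   // array of char32_t
// Element type is char before C++20 and char8_t from C++20 on; binding by
// reference keeps the array type either way.
const auto &utf8Text = u8"\u00e4";     // U+00E4 encoded as two UTF-8 bytes
```

Each array includes the terminating null character, so utf16Text and
utf32Text have four elements and utf8Text has three.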

This is particularly important for the use of QStringLiteral,
since on some platforms it relies on an expansion to literals like the above.

Note: The string literal quickfixes still need some tuning.

Task-number: QTCREATORBUG-7449
Change-Id: Iebcfea15677dc8e0ebb6143def89a5477e1be7d4
Reviewed-by: hjk <qthjk@ovi.com>
2012-06-06 14:55:07 +02:00
Oswald Buddenhagen
b342ad8cf4 remove nokia copyrights from roberto's code
they are lying. nokia has no copyright on this code. and the double
license in a single file looks weird. that's why we moved it to
3rdparty/, so it is clear it is not nokia's.

Approved-by: legal
2011-05-16 11:05:30 +02:00
Oswald Buddenhagen
67704b8b41 move src/shared/cplusplus/ -> src/libs/3rdparty/cplusplus/
Approved-by: legal
2011-05-16 11:05:30 +02:00