feat: add way to style identifiers
Replace "default" segment with "whitespace" and "identifier" segments,
with fallback to "unknown" segment.

Also, classify backticked identifiers like `foo` as "identifier" rather than "string".

This allows for identifiers to be styled independently from strings and whitespace.

It also simplifies getSegments() from 30 lines down to 5, by removing the special-case
code for the "default" segment.

BREAKING CHANGE: The `default` segment has been split into `identifier` and `whitespace`
segments. There is also a new `unknown` segment that only shows up for malformed
SQL, such as an unclosed string.

However, the highlight() function works largely the same as before in both normal mode and HTML mode,
except for the bug fix that stops backticked identifiers from being classified as strings. In other words, SQL like

select * from EMP where NAME="John Smith"

will get highlighted the same as before, i.e. no syntax highlighting for EMP or NAME.

Fixes #147
wkeese authored Oct 24, 2023
1 parent 9f9fbe3 commit 01df1cd
Showing 4 changed files with 87 additions and 111 deletions.
32 changes: 16 additions & 16 deletions README.md
@@ -59,13 +59,13 @@ document.body.innerHTML += highlighted
 **Output:**
 ```html
 <span class="sql-hl-keyword">SELECT</span>
-<span class="sql-hl-string">`id`</span>
+<span class="sql-hl-identifier">`id`</span>
 <span class="sql-hl-special">,</span>
-<span class="sql-hl-string">`username`</span>
+<span class="sql-hl-identifier">`username`</span>
 <span class="sql-hl-keyword">FROM</span>
-<span class="sql-hl-string">`users`</span>
+<span class="sql-hl-identifier">`users`</span>
 <span class="sql-hl-keyword">WHERE</span>
-<span class="sql-hl-string">`email`</span>
+<span class="sql-hl-identifier">`email`</span>
 <span class="sql-hl-special">=</span>
 <span class="sql-hl-string">'[email protected]'</span>
 ```
@@ -112,22 +112,22 @@ console.log(segments)
 ```js
 [
   { name: 'keyword', content: 'SELECT' },
-  { name: 'default', content: ' ' },
-  { name: 'string', content: '`id`' },
+  { name: 'whitespace', content: ' ' },
+  { name: 'identifier', content: '`id`' },
   { name: 'special', content: ',' },
-  { name: 'default', content: ' ' },
-  { name: 'string', content: '`username`' },
-  { name: 'default', content: ' ' },
+  { name: 'whitespace', content: ' ' },
+  { name: 'identifier', content: '`username`' },
+  { name: 'whitespace', content: ' ' },
   { name: 'keyword', content: 'FROM' },
-  { name: 'default', content: ' ' },
-  { name: 'string', content: '`users`' },
-  { name: 'default', content: ' ' },
+  { name: 'whitespace', content: ' ' },
+  { name: 'identifier', content: '`users`' },
+  { name: 'whitespace', content: ' ' },
   { name: 'keyword', content: 'WHERE' },
-  { name: 'default', content: ' ' },
-  { name: 'string', content: '`email`' },
-  { name: 'default', content: ' ' },
+  { name: 'whitespace', content: ' ' },
+  { name: 'identifier', content: '`email`' },
+  { name: 'whitespace', content: ' ' },
   { name: 'special', content: '=' },
-  { name: 'default', content: ' ' },
+  { name: 'whitespace', content: ' ' },
   { name: 'string', content: "'[email protected]'" }
 ]
 ```
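
Callers that relied on the old `default` segment may want a stopgap while they migrate. The following is a minimal sketch, not part of the library: `toLegacySegments` is a hypothetical helper, and it assumes that whitespace, bare identifiers, and unknown text previously ended up in `default`, while backticked identifiers were reported as `string`. It only approximates the old output (for example, `.` is now a `special` segment and is not folded back).

```javascript
// Hypothetical adapter (not part of sql-highlight): approximates the
// pre-change segment names from the new ones.
function toLegacySegments (segments) {
  const legacy = []
  for (const { name, content } of segments) {
    let legacyName = name
    if (name === 'identifier') {
      // Assumption: backticked identifiers used to be strings, bare ones default.
      legacyName = content.startsWith('`') ? 'string' : 'default'
    } else if (name === 'whitespace' || name === 'unknown') {
      legacyName = 'default'
    }
    const last = legacy[legacy.length - 1]
    if (legacyName === 'default' && last && last.name === 'default') {
      // Merge neighbouring default text into one segment.
      last.content += content
    } else {
      legacy.push({ name: legacyName, content })
    }
  }
  return legacy
}

const segments = [
  { name: 'keyword', content: 'FROM' },
  { name: 'whitespace', content: ' ' },
  { name: 'identifier', content: 'EMP' },
  { name: 'whitespace', content: ' ' },
  { name: 'keyword', content: 'WHERE' },
  { name: 'whitespace', content: ' ' },
  { name: 'identifier', content: '`email`' }
]
console.log(toLegacySegments(segments))
```

The adapter builds new objects, so the array returned by `getSegments()` is left untouched.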
1 change: 1 addition & 0 deletions lib/index.d.ts
@@ -8,6 +8,7 @@ declare module 'sql-highlight' {
       function: string;
       number: string;
       string: string;
+      identifier: string;
       special: string;
       bracket: string;
       comment: string;
74 changes: 24 additions & 50 deletions lib/index.js
@@ -12,20 +12,19 @@ const DEFAULT_OPTIONS = {
     function: '\x1b[31m',
     number: '\x1b[32m',
     string: '\x1b[32m',
+    identifier: '\x1b[0m',
     special: '\x1b[33m',
     bracket: '\x1b[33m',
     comment: '\x1b[2m\x1b[90m',
     clear: '\x1b[0m'
   }
 }
 
-const DEFAULT_KEYWORD = 'default'
-
 const highlighters = [
   /\b(?<number>\d+(?:\.\d+)?)\b/,
 
   // Note: Repeating string escapes like 'sql''server' will also work as they are just repeating strings
-  /(?<string>'(?:[^'\\]|\\.)*'|"(?:[^"\\]|\\.)*"|`(?:[^`\\]|\\.)*`)/,
+  /(?<string>'(?:[^'\\]|\\.)*'|"(?:[^"\\]|\\.)*")/,
 
   /(?<comment>--[^\n\r]*|#[^\n\r]*|\/\*(?:[^*]|\*(?!\/))*\*\/)/,

@@ -34,54 +33,29 @@ const highlighters = [
 
   /(?<bracket>[()])/,
 
-  /(?<special>!=|[=%*/\-+,;:<>])/
-]
+  /(?<special>!=|[=%*/\-+,;:<>.])/,
 
-function getRegexString (regex) {
-  const str = regex.toString()
-  return str.replace(/^\/|\/\w*$/g, '')
-}
+  /(?<identifier>\b\w+\b|`(?:[^`\\]|\\.)*`)/,
+
+  /(?<whitespace>\s+)/,
+
+  /(?<unknown>\.+?)/
+]
 
-// Regex of the shape /(.*?)|((?<token1>...)|(?<token2>...)|...|$)/y
+// Regex of the shape /(?<token1>...)|(?<token2>...)|.../g
 const tokenizer = new RegExp(
-  '(.*?)(' +
-  '\\b(?<keyword>' + keywords.join('|') + ')\\b|' +
-  highlighters.map(getRegexString).join('|') +
-  '|$)', // $ needed to to match "default" till the end of string
-  'isy'
+  [
+    '\\b(?<keyword>' + keywords.join('|') + ')\\b',
+    ...highlighters.map(regex => regex.source)
+  ].join('|'),
+  'gis'
 )
 
 function getSegments (sqlString) {
-  const segments = []
-  let match
-
-  // Reset the starting position
-  tokenizer.lastIndex = 0
-
-  // This is probably the one time when an assignment inside a condition makes sense
-  // eslint-disable-next-line no-cond-assign
-  while (match = tokenizer.exec(sqlString)) {
-    if (match[1]) {
-      segments.push({
-        name: DEFAULT_KEYWORD,
-        content: match[1]
-      })
-    }
-
-    if (match[2]) {
-      const name = Object.keys(match.groups).find(key => match.groups[key])
-      segments.push({
-        name,
-        content: match.groups[name]
-      })
-    }
-
-    // Stop at the end of string
-    if (match.index + match[0].length >= sqlString.length) {
-      break
-    }
-  }
-
+  const segments = Array.from(sqlString.matchAll(tokenizer), match => ({
+    name: Object.keys(match.groups).find(key => match.groups[key]),
+    content: match[0]
+  }))
   return segments
 }
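
The tokenizer construction in this hunk can be tried in isolation. The sketch below is a cut-down copy under stated assumptions: `keywords` is a three-word stand-in for the library's list, the `comment`, `function`, and `bracket` patterns are left out, and the fallback pattern is written with an unescaped `.` so that any leftover character is captured as `unknown` (the diff itself writes `\.+?`).

```javascript
// Assumption: a tiny stand-in for the library's keyword list.
const keywords = ['SELECT', 'FROM', 'WHERE']

// Order matters: earlier alternatives win, so the catch-all goes last.
const highlighters = [
  /\b(?<number>\d+(?:\.\d+)?)\b/,
  /(?<string>'(?:[^'\\]|\\.)*'|"(?:[^"\\]|\\.)*")/,
  /(?<special>!=|[=%*/\-+,;:<>.])/,
  /(?<identifier>\b\w+\b|`(?:[^`\\]|\\.)*`)/,
  /(?<whitespace>\s+)/,
  // Assumption: unescaped dot, so one stray character at a time is captured.
  /(?<unknown>.+?)/
]

// One global regex of the shape /(?<token1>...)|(?<token2>...)|.../gis
const tokenizer = new RegExp(
  [
    '\\b(?<keyword>' + keywords.join('|') + ')\\b',
    ...highlighters.map(regex => regex.source)
  ].join('|'),
  'gis'
)

function getSegments (sqlString) {
  // The segment name is whichever named group took part in the match.
  return Array.from(sqlString.matchAll(tokenizer), match => ({
    name: Object.keys(match.groups).find(key => match.groups[key]),
    content: match[0]
  }))
}

// Malformed input: the string literal is never closed.
const input = "select * from EMP where NAME='John"
console.log(getSegments(input))
```

In this sketch every character falls into some alternative, so joining the `content` fields reproduces the input. The lone `'` comes back as an `unknown` segment, and `EMP` and `NAME` come back as `identifier`.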

@@ -90,14 +64,14 @@ function highlight (sqlString, options) {
 
   return getSegments(sqlString)
     .map(({ name, content }) => {
-      if (name === DEFAULT_KEYWORD) {
-        return content
-      }
       if (options.html) {
         const escapedContent = options.htmlEscaper(content)
-        return `<span class="${options.classPrefix}${name}">${escapedContent}</span>`
+        return name === 'whitespace' ? escapedContent : `<span class="${options.classPrefix}${name}">${escapedContent}</span>`
       }
+      if (options.colors[name]) {
+        return options.colors[name] + content + options.colors.clear
+      }
-      return options.colors[name] + content + options.colors.clear
+      return content
     })
     .join('')
 }
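
The new guard in normal mode matters because `whitespace` and `unknown` have no entry in the default color table, and concatenating a missing color would prepend the text `undefined`. A minimal sketch of that branch follows. `renderSegments`, `demoColors`, and `demoSegments` are hypothetical names for illustration, not the library's code.

```javascript
// Hypothetical stand-in for the normal-mode branch of highlight().
function renderSegments (segments, colors) {
  return segments
    .map(({ name, content }) =>
      // Segment names without a configured color are emitted unstyled.
      colors[name] ? colors[name] + content + colors.clear : content
    )
    .join('')
}

// Assumption: example colors chosen for this sketch only.
const demoColors = { keyword: '\x1b[35m', identifier: '\x1b[36m', clear: '\x1b[0m' }
const demoSegments = [
  { name: 'keyword', content: 'FROM' },
  { name: 'whitespace', content: ' ' },
  { name: 'identifier', content: 'EMP' }
]

// JSON.stringify makes the escape codes visible instead of coloring the terminal.
console.log(JSON.stringify(renderSegments(demoSegments, demoColors)))
```

Here the keyword and identifier are wrapped in their color codes, while the space between them passes through unchanged.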