Vector: Query vector

A query vector is a tokenized version of a query sequence for a given time frame. This method is still a work in progress. The focus so far has been on obfuscating the sequences to avoid re-identification of the client, and this obfuscation limits the effectiveness of the analysis. The token space is deliberately small, and tokens are generated by a hash function, so collisions are to be expected. Any new tokens found are submitted together with the vectors.
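The tokenization described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the source does not name the hash function, the size of the token space, or how a query is split into words, so SHA-256, a 16-bit token space, whitespace splitting, and the names `token_for` and `query_vector` are all hypothetical choices.

```python
import hashlib

# Assumed size of the token space; the source only says it is "deliberately small".
TOKEN_SPACE_BITS = 16


def token_for(word: str, space_bits: int = TOKEN_SPACE_BITS) -> bytes:
    """Map a word to a 4-byte (32-bit) token; distinct words may collide by design."""
    # SHA-256 is an assumption; any hash function could stand in here.
    digest = hashlib.sha256(word.lower().encode("utf-8")).digest()
    value = int.from_bytes(digest[:4], "big")
    # Keep only the low bits so that the token space stays small.
    value &= (1 << space_bits) - 1
    return value.to_bytes(4, "big")


def query_vector(queries: list[str]) -> bytes:
    """Concatenate the tokens of every word in one client's query sequence."""
    return b"".join(token_for(word) for query in queries for word in query.split())
```

Reducing the hash to a small space makes it harder to map a token back to a single word, at the cost of merging unrelated words into the same token.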

Data

#	Name	Type	Required	Comment
1	StartTime	Timestamp	yes	Starting point for vector
2	Duration	Integer	yes	Length of the time window in seconds
3	Vectors	list<Bytestring>	yes	Vectors for all clients for the given time window. Each vector consists of tokens, which are 32-bit hashes of the words they represent
4	Wordlist delta	list<Bytestring>	yes	Words for all tokens not on the default list, i.e. the list of new words
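As a rough sketch of how a record with these four fields might be held in memory, consider the following. The class name, the field names, the UTF-8 encoding of new words, and the check that a vector's length is a multiple of 4 bytes are assumptions made for illustration; the source does not define a wire format.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class QueryVectorRecord:
    """Hypothetical container mirroring the four fields of the table."""
    start_time: datetime                 # StartTime: starting point for the vectors
    duration: int                        # Duration: window length in seconds
    vectors: list[bytes] = field(default_factory=list)         # one bytestring per client
    wordlist_delta: list[bytes] = field(default_factory=list)  # words not on the default list

    def validate(self) -> None:
        if self.duration <= 0:
            raise ValueError("duration must be a positive number of seconds")
        for vector in self.vectors:
            # Assumption: a vector is a plain concatenation of 32-bit (4-byte) tokens.
            if len(vector) % 4 != 0:
                raise ValueError("vector length is not a multiple of 4 bytes")


def new_words(words: list[str], default_words: set[str]) -> list[bytes]:
    """Return the words missing from the default list, deduplicated, in first-seen order."""
    seen: set[str] = set()
    delta: list[bytes] = []
    for word in words:
        if word not in default_words and word not in seen:
            seen.add(word)
            delta.append(word.encode("utf-8"))  # UTF-8 is an assumed encoding
    return delta
```

A caller would build the delta with `new_words`, attach it to the record, and call `validate()` before submitting.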
