kdbplus Archives - Donotsad where can learn any thing work project and make money

Aug 10

KDB+ – Discussion

Kdb+ is a high-performance column-oriented database from Kx Systems Inc. It is designed to capture, analyze, compare, and store data at high speed and on high volumes of data. The tutorial starts with a basic introduction to kdb+, followed by its architecture, installation, and basic-to-advanced coverage of the q programming language.

Aug 10

KDB+ – Quick Guide

KDB+ – Overview

This is a complete guide to kdb+ from Kx Systems, aimed primarily at those learning independently. kdb+, introduced in 2003, is the new generation of the kdb database, designed to capture, analyze, compare, and store data.

A kdb+ system contains the following two components:

KDB+: the database (k database plus)
Q: the programming language for working with kdb+

Both kdb+ and q are written in the k programming language (similar to q but less readable).

Background

Kdb+/q originated as an obscure academic language, but over the years it has gradually become more user friendly.

APL (1964, A Programming Language)
A+ (1988, modified APL by Arthur Whitney)
K (1993, crisp version of A+, developed by A. Whitney)
Kdb (1998, in-memory column-based db)
Kdb+/q (2003, q language, a more readable version of k)

Why and Where to Use KDB+

Why? If you need a single solution for real-time data with analytics, you should consider kdb+. Kdb+ stores a database as ordinary native files, so it has no special requirements regarding hardware or storage architecture. Because the database is just a set of files, administrative work is not difficult.

Where to use KDB+? It is easier to count the investment banks that are NOT using kdb+, as most of them either use it already or plan to switch to it from conventional databases. As the volume of data increases day by day, we need a system that can handle huge volumes of data. KDB+ fulfills this requirement: it not only stores an enormous amount of data but also analyzes it in real time.

Getting Started

With this background, let us now set up an environment for KDB+. We will start with how to download and install KDB+.
Downloading & Installing KDB+

You can get the free 32-bit version of KDB+, with all the functionality of the 64-bit version, from http://kx.com/software-download.php

Agree to the license agreement and select your operating system (all major operating systems are available). For Windows, the latest version at the time of writing is 3.2. Download the latest version. Once you unzip it, you will get a folder named "windows", and inside it another folder named "q". Copy the entire q folder onto your c:/ drive.

Open the Run terminal and type the location where you stored the q folder; it will be like "c:/q/w32/q.exe". Once you hit Enter, you will get a new console.

[Figure: q console after startup]

On the first line of the console, you can see the version number, which is 3.2, and the release date, 2015.03.05.

Directory Layout

The trial/free version is generally installed in the following directories.

For Linux/Mac:

~/q        main q directory (under the user's home)
~/q/l32    location of the Linux 32-bit executable
~/q/m32    location of the Mac 32-bit executable

For Windows:

c:/q       main q directory
c:/q/w32/  location of the Windows 32-bit executable

Example Files

[Figure: q directory structure on Windows after download]

In the downloaded directory structure, trade.q and sp.q are example files which we can use as a reference point.

KDB+ – Architecture

Kdb+ is a high-performance, high-volume database designed from the outset to handle tremendous volumes of data. It is fully 64-bit, and has built-in multi-core processing and multi-threading. The same architecture is used for real-time and historical data. The database incorporates its own powerful query language, q, so analytics can be run directly on the data.

kdb+tick is an architecture which allows the capture, processing, and querying of real-time and historical data.
Kdb+/tick Architecture

A typical kdb+/tick architecture, its components, and the flow of data through them are outlined below.

[Figure: generalized outline of a typical kdb+/tick architecture]

The data feeds are time series data, mostly supplied by data feed providers such as Reuters or Bloomberg, or directly by exchanges.

To get the relevant data, the data from the data feed is parsed by the feed handler.

Once the data is parsed by the feed handler, it goes to the ticker plant.

To allow recovery from any failure, the ticker plant first writes the new data to the log file and then updates its own tables.

After the internal tables and the log file are updated, the data is published on a timer loop to the real-time database and to all chained subscribers that requested it.

At the end of a business day, the log file is deleted, a new one is created, and the real-time database is saved onto the historical database.

Once all the data is saved onto the historical database, the real-time database purges its tables.

Components of Kdb+ Tick Architecture

Data Feeds

Data feeds can be any market data or other time series data. Consider data feeds as the raw input to the feed handler. Feeds can come directly from the exchange (live-streaming data), from news/data providers like Thomson-Reuters and Bloomberg, or from any other external agency.

Feed Handler

A feed handler converts the data stream into a format suitable for writing to kdb+. It is connected to the data feed; it retrieves the data and converts it from the feed-specific format into a kdb+ message, which is published to the ticker plant process. Generally, a feed handler is used to perform the following operations:

Capture data according to a set of rules.
Translate (or enrich) that data from one format to another.
Catch the most recent values.

Ticker Plant

The ticker plant is the most important component of the kdb+ architecture.
Real-time databases and direct subscribers (clients) connect to the ticker plant to access the financial data. It operates on a publish-and-subscribe mechanism: once a client has subscribed, the publisher (the ticker plant) publishes ticks to it on a regular basis. It performs the following operations:

Receives the data from the feed handler.
Immediately after the ticker plant receives the

Aug 10

Q Language – Attributes

Lists, dictionaries, or columns of a table can have attributes applied to them. Attributes impose certain properties on the list. Some attributes might disappear on modification.

Types of Attributes

Sorted (`s#)

`s# means the list is sorted in ascending order. If a list is explicitly sorted by asc (or xasc), the list automatically has the sorted attribute set.

q)L1: asc 40 30 20 50 9 4
q)L1
`s#4 9 20 30 40 50

A list that is known to be sorted can also have the attribute set explicitly. Q checks whether the list is sorted and, if it is not, throws an s-fail error.

q)L2:30 40 24 30 2
q)`s#L2
's-fail

The sorted attribute is lost upon an unsorted append.

Parted (`p#)

`p# means the list is parted: identical items are stored contiguously.

The range is an int or a temporal type having an underlying int value, such as years, months, days, etc. You can also partition over a symbol, provided it is enumerated.

Applying the parted attribute creates an index dictionary that maps each unique output value to the position of its first occurrence. When a list is parted, lookup is much faster, since linear search is replaced by hashtable lookup.

q)L:`p# 99 88 77 1 2 3
q)L
`p#99 88 77 1 2 3
q)L,:3
q)L
99 88 77 1 2 3 3

Note: The parted attribute is not preserved under an operation on the list, even if the operation preserves the partitioning.

The parted attribute should be considered when the number of entities reaches a billion and most of the partitions are of substantial size, i.e., there is significant repetition.

Grouped (`g#)

`g# means the list is grouped. An internal dictionary is built and maintained which maps each unique item to each of its indices, requiring considerable storage space. For a list of length L containing u unique items of size s, this will be (L × 4) + (u × s) bytes.

Grouping can be applied to a list when no other assumptions about its structure can be made.
The attribute can be applied to any typed list. It is maintained on appends, but lost on deletes.

q)L: `g# 1 2 3 4 5 4 2 3 1 4 5 6
q)L
`g#1 2 3 4 5 4 2 3 1 4 5 6
q)L,:9
q)L
`g#1 2 3 4 5 4 2 3 1 4 5 6 9
q)L _:2
q)L
1 2 4 5 4 2 3 1 4 5 6 9

Unique (`u#)

Applying the unique attribute (`u#) to a list indicates that the items of the list are distinct. Knowing that the elements of a list are unique dramatically speeds up distinct and allows q to finish some comparisons early.

When a list is flagged as unique, an internal hash map to each item in the list is created. Operations on the list must preserve uniqueness or the attribute is lost.

q)LU:`u#`MSFT`SAMSUNG`APPLE
q)LU
`u#`MSFT`SAMSUNG`APPLE
q)LU,:`IBM / Uniqueness preserved
q)LU
`u#`MSFT`SAMSUNG`APPLE`IBM
q)LU,:`SAMSUNG / Attribute lost
q)LU
`MSFT`SAMSUNG`APPLE`IBM`SAMSUNG

Note: `u# is preserved on concatenations which preserve the uniqueness. It is lost on deletions and non-unique concatenations. Searches on `u# lists are done via a hash function.

Removing Attributes

Attributes can be removed by applying `#.

Applying Attributes

There are three formats for applying attributes:

L:`s#2 3 3 9 14 / Specify during list creation (the list must already be sorted)
@[`.;`L;`s#] / Functional apply: to the variable L in the default namespace (`.), apply the sorted attribute `s#
update `s#time from `tab / Update the table tab in place to apply the attribute

Let's apply the above three formats with examples.

q)/ set the attribute during creation
q)L:`s# 3 4 9 10 23 84 90
q)/ apply the attribute to existing list data
q)L1: 9 18 27 36 42 54
q)@[`.;`L1;`s#]
`.
q)L1 / check
`s#9 18 27 36 42 54
q)@[`.;`L1;`#] / clear attribute
`.
q)L1
9 18 27 36 42 54
q)/ update a table to apply the attribute
q)t:([]time:09:00 09:30 10:00t;sym:`ibm`msft`samsung; mcap:9000 18000 27000)
q)t
time         sym     mcap
--------------------------
09:00:00.000 ibm     9000
09:30:00.000 msft    18000
10:00:00.000 samsung 27000
q)update `s#time from `t
`t
q)meta t / check it was applied
c   | t f a
----| -----
time| t   s
sym | s
mcap| j

In the meta result above, the attribute column (a) shows that the time column is sorted (`s#).
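The behaviour described in this chapter can be checked with the built-in attr function, which returns the attribute of its argument as a symbol (the null symbol when there is none). The following is a minimal sketch; the variable names s and u are chosen only for illustration.

```q
/ asc sets the sorted attribute on its result
s:asc 40 30 20 50 9 4
attr s            / `s

/ an unsorted append drops the attribute
s,:1
attr s            / ` (null symbol, no attribute)

/ unique attribute set explicitly, then removed with `#
u:`u#`MSFT`SAMSUNG`APPLE
attr u            / `u
attr `#u          / ` (attribute removed)
```

Checking attr after each modification is a cheap way to confirm that an update has not silently dropped an attribute that later queries rely on.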

Aug 10

Q Language – Indexing

A list is ordered from left to right by the position of its items. The offset of an item from the beginning of the list is called its index. Thus, the first item has index 0, the second item (if there is one) has index 1, and so on. A list of count n has an index domain from 0 to n–1.

Index Notation

Given a list L, the item at index i is accessed by L[i]. Retrieving an item by its index is called item indexing. For example,

q)L:(99;98.7e;`b;`abc;"z")
q)L[0]
99
q)L[1]
98.7e
q)L[4]
"z"

Indexed Assignment

Items in a list can also be assigned via item indexing. Thus,

q)L1:9 8 7
q)L1[2]:66 / Indexed assignment into a simple list enforces strict type matching.
q)L1
9 8 66

Lists from Variables

q)l1:(9;8;40;200)
q)l2:(1 4 3; `abc`xyz)
q)l:(l1;l2) / combining the two lists l1 and l2
q)l
9 8 40 200
(1 4 3;`abc`xyz)

Joining Lists

The most common operation on two lists is to join them together to form a larger list. More precisely, the join operator (,) appends its right operand to the end of the left operand and returns the result. It accepts an atom in either argument.

q)1,2 3 4
1 2 3 4
q)1 2 3, 4.4 5.6 / If the arguments are not of uniform type, the result is a general list.
1
2
3
4.4
5.6

Nesting

Data complexity is built by using lists as items of lists.

Depth

The number of levels of nesting for a list is called its depth. Atoms have a depth of 0 and simple lists have a depth of 1.

q)l1:(9;8;(99;88))
q)count l1
3

Here is a list of depth 3 having two items:

q)l5:(9;(90;180;900 1800 2700 3600))
q)l5
9
(90;180;900 1800 2700 3600)
q)count l5
2
q)count l5[1]
3

Indexing at Depth

It is possible to index directly into the items of a nested list.

Repeated Item Indexing

Retrieving an item via a single index always retrieves an uppermost item from a nested list.

q)L:(1;(100;200;(300;400;500;600)))
q)L[0]
1
q)L[1]
100
200
300 400 500 600

Since the result L[1] is itself a list, we can retrieve its elements using a single index.
q)L[1][2]
300 400 500 600

We can repeat single indexing once more to retrieve an item from the innermost nested list.

q)L[1][2][0]
300

You can read this as: get the item at index 1 from L, from it retrieve the item at index 2, and from that retrieve the item at index 0.

Notation for Indexing at Depth

There is an alternate notation for repeated indexing into the constituents of a nested list. The last retrieval can also be written as,

q)L[1;2;0]
300

Assignment via index also works at depth.

q)L[1;2;1]:900
q)L
1
(100;200;300 900 500 600)

Elided Indices

Eliding Indices for a General List

q)L:((1 2 3; 4 5 6 7); (`a`b`c;`d`e`f`g;`0`1`2);("good";"morning"))
q)L
(1 2 3;4 5 6 7)
(`a`b`c;`d`e`f`g;`0`1`2)
("good";"morning")
q)L[;1;]
4 5 6 7
`d`e`f`g
"morning"
q)L[;;2]
3 6
`c`f`2
"or"

Interpret L[;1;] as: retrieve the item in the second position of each list at the top level.

Interpret L[;;2] as: retrieve the items in the third position of each list at the second level.
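As a small sketch of the equivalences described above, the following compares repeated item indexing, the bracket notation for indexing at depth, and the dot (index-at-depth) operator, which takes the list of indices as its right argument. The names a, b, c, M and col are illustrative only.

```q
L:(1;(100;200;(300;400;500;600)))

a:L[1][2][0]      / repeated item indexing
b:L[1;2;0]        / indexing at depth, bracket notation
c:L . 1 2 0       / indexing at depth, dot operator

/ eliding the first index selects across all top-level items
M:(1 2 3;4 5 6)
col:M[;1]         / second item of each row
```

All three of a, b and c retrieve the same item, and col holds the second item of each row of M.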

Aug 10

Q Language – Maintenance Functions

.Q.en

.Q.en is a dyadic function which helps in splaying a table by enumerating its symbol columns. It is especially useful when we are dealing with a historical db (splayed tables, partitioned tables, etc.).

.Q.en[`:directory;table]

where directory is the home directory of the historical database, where the sym file is located, and table is the table to be enumerated.

Manual enumeration of tables is not required to save them as splayed tables, as this is done by:

.Q.en[`:directory_where_symbol_file_stored]table_name

.Q.dpft

The .Q.dpft function helps in creating partitioned and segmented tables. It is an advanced form of .Q.en, as it not only splays the table but also creates a partitioned table.

There are four arguments used in .Q.dpft:

the symbolic file handle of the database where we want to create a partition,
the q data value with which we are going to partition the table,
the name of the field to which the parted (`p#) attribute is going to be applied (usually `sym), and
the table name.

Let's take an example to see how it works:

q)tab:([]sym:5?`msft`hsbc`samsung`ibm;time:5?(09:30:30);price:5?30.25)
q).Q.dpft[`:c:/q/;2014.08.24;`sym;`tab]
`tab
q)delete tab from `
'type
q)delete tab from `/
'type
q)delete tab from .
'type
q)delete tab from `.
`.
q)tab
'tab

Only the last form, which names the default namespace `., succeeds. We have now deleted the table tab from memory. Let us load it from the db:

q)\l c:/q/2014.08.24/
q)\a
,`tab
q)tab
sym     time     price
-------------------------
hsbc    07:38:13 15.64201
hsbc    07:21:05 5.387037
msft    06:16:58 11.88076
msft    08:09:26 12.30159
samsung 04:57:56 15.60838

.Q.chk

.Q.chk is a monadic function whose single parameter is the symbolic file handle of the root directory. It creates empty tables in a partition, wherever necessary, by examining each partition subdirectory in the root.

.Q.chk `:directory

where directory is the home directory of the historical database.
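The .Q.en call described above can be sketched as follows. This is only an illustration under assumptions: the database root /tmp/qdemo_db and the table name trade are hypothetical, and the directory must be writable on your system.

```q
/ hypothetical table with one symbol column
t:([]sym:`ibm`msft`ibm;price:10.5 20.25 11.0)

/ enumerate the symbol column against /tmp/qdemo_db/sym, then save splayed
/ (the trailing slash in the target path requests the splayed format)
`:/tmp/qdemo_db/trade/ set .Q.en[`:/tmp/qdemo_db;t]

/ the sym file now holds the distinct symbols seen so far
syms:get `:/tmp/qdemo_db/sym

/ read the splayed table back
t2:get `:/tmp/qdemo_db/trade/
```

Because .Q.en appends any new symbols to the shared sym file, further tables saved under the same root are enumerated against the same list.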

Aug 10

Q Language – Joins

In the q language, we have different kinds of joins based on the input tables supplied and the kind of joined table we desire. A join combines data from two tables. Besides foreign key chasing, there are four other ways to join tables:

Simple join
Asof join
Left join
Union join

In this chapter, we will discuss each of these joins in detail.

Simple Join

Simple join is the most basic type of join, performed with a comma (,). In this case, the two tables have to be type conformant, i.e., both tables have the same number of columns in the same order, and the same key.

table1,:table2 / the rows of table2 are appended to table1

We can use the comma-each join to join tables of the same length sideways. One of the tables can be keyed here,

Table1,'Table2

Asof Join (aj)

It is the most powerful join, used to get the value of a field in one table as of the time in another table. Generally it is used to get the prevailing bid and ask at the time of each trade.

General format

aj[joinColumns;tbl1;tbl2]

For example,

aj[`sym`time;trade;quote]

Example

q)tab1:([]a:(1 2 3 4);b:(2 3 4 5);d:(6 7 8 9))
q)tab2:([]a:(2 3 4);b:(3 4 5); c:( 4 5 6))
q)show aj[`a`b;tab1;tab2]
a b d c
-------
1 2 6
2 3 7 4
3 4 8 5
4 5 9 6

Left Join (lj)

In a left join, the second argument is a keyed table and the first argument contains the columns of the right argument's key. Each row of the first table is matched against that key.

General format

table1 lj Keyed-table

Example

q)/ Left join syntax: table1 lj table2 or lj[table1;table2]
q)tab1:([]a:(1 2 3 4);b:(2 3 4 5);d:(6 7 8 9))
q)tab2:([a:(2 3 4);b:(3 4 5)] c:( 4 5 6))
q)show lj[tab1;tab2]
a b d c
-------
1 2 6
2 3 7 4
3 4 8 5
4 5 9 6

Union Join (uj)

It allows us to create a union of two tables with distinct schemas.
It is basically an extension of the simple join (,).

q)tab1:([]a:(1 2 3 4);b:(2 3 4 5);d:(6 7 8 9))
q)tab2:([]a:(2 3 4);b:(3 4 5); c:( 4 5 6))
q)show uj[tab1;tab2]
a b d c
-------
1 2 6
2 3 7
3 4 8
4 5 9
2 3   4
3 4   5
4 5   6

If you are using uj on keyed tables, then the primary keys must match.
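The typical use of aj mentioned above, finding the prevailing quote at the time of each trade, can be sketched as follows. The trade and quote tables and their values are made up for illustration; the sketch assumes the quote table is ordered by time within each sym.

```q
/ hypothetical trades
trade:([]sym:`ibm`ibm`msft;time:09:30:01 09:30:05 09:30:03;price:100.0 100.5 50.0)

/ hypothetical quotes, time-ordered within each sym
quote:([]sym:`ibm`ibm`msft`msft;time:09:30:00 09:30:04 09:30:02 09:30:04;bid:99.9 100.4 49.9 50.1)

/ for each trade: exact match on sym, most recent quote at or before the trade time
r:aj[`sym`time;trade;quote]
bids:exec bid from r
```

Each trade row keeps its own time and picks up the bid from the latest quote for the same sym that is not later than the trade.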

Aug 10

KDB+ – Useful Resources

The following resources contain additional information on KDB+. Please use them to get more in-depth knowledge of the subject.

Useful Links on KDB+

KDB+: official site of KDB+.
Q Programming Language Wiki: Wikipedia reference for the Q programming language.

Aug 10

Q – Message Handler (.Z Library)

When a q process connects to another q process via inter-process communication, the messages it sends are processed by message handlers. These message handlers have a default behavior. For example, in the case of synchronous message handling, the handler returns the value of the query. The synchronous handler in this case is .z.pg, which we can override as required.

Kdb+ processes have several predefined message handlers. Message handlers are important for configuring the database. Some of their uses include:

Logging: log incoming messages (helpful in case of any fatal error).
Security: allow or disallow access to the database, certain function calls, etc., based on username or IP address. This helps in providing access to authorized subscribers only.
Handling connections to and disconnections from other processes.

Predefined Message Handlers

Some of the predefined message handlers are discussed below.

.z.pg

It is the synchronous message handler (process get). This function is called automatically whenever a sync message is received on a kdb+ instance.

The parameter is the string or function call to be executed, i.e., the message passed. By default, it is defined as follows:

.z.pg:{value x} / simply execute the message received

but we can override it to give any customized result.

.z.pg:{handle::.z.w;value x} / this will store the remote handle
.z.pg:{show .z.w;value x} / this will show the remote handle

.z.ps

It is the asynchronous message handler (process set), the equivalent of .z.pg for asynchronous messages. The parameter is the string or function call to be executed. By default, it is defined as,

.z.ps:{value x} / Can be overridden for a customized action.

Following is a customized message handler for asynchronous messages, where we have used protected execution,

.z.ps:{@[value;x;errhandler]}

Here errhandler is a function that is called with the error message in case of any unexpected error.

.z.po[]

It is the connection open handler (process-open).
It is executed when a remote process opens a connection; its argument is the handle of the new connection. To see the handle when a connection to a process is opened, we can define .z.po as,

.z.po:{show "Connection opened by handle ",string x}

.z.pc[]

It is the close connection handler (process-close). It is called when a connection is closed. We can create our own close handler which resets the global connection handle to 0 and sets the timer to fire (execute) every 3 seconds (3000 milliseconds).

.z.pc:{h::0; system "t 3000"}

The timer handler (.z.ts) attempts to re-open the connection. On success, it turns the timer off.

.z.ts:{h::hopen `::5001; if[h>0; system "t 0"]}

.z.pi[]

PI stands for process input. It is called for any sort of input. It can be used to handle console input or remote client input. Using .z.pi[], one can validate the console input or replace the default display. In addition, it can be used for any sort of logging operation.

q).z.pi
'.z.pi
q).z.pi:{">", .Q.s value x}
q)5+4
>9
q)30+42
>72
q)30*2
>60
q)\x .z.pi
>q)5+4
9

.z.pw

It is the connection validation handler (user authentication). It adds an extra callback when a connection is being opened to a kdb+ session. It is called after the -u/-U checks and before .z.po (port open).

.z.pw:{[user_id;passwd] 1b}

The inputs are the user id (symbol) and the password (text).
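The logging use of message handlers mentioned in this chapter can be sketched by overriding .z.pg. This is a minimal illustration rather than a recommended production setup: the table name querylog and its columns are hypothetical, and the handler is called directly here instead of over a real connection (at the console .z.w is 0i).

```q
/ hypothetical in-memory log of incoming synchronous messages
querylog:([]time:`timestamp$();handle:`int$();query:())

/ record the time, the calling handle and the message, then evaluate as the default does
.z.pg:{`querylog insert (.z.p;.z.w;enlist x); value x}

/ simulate an incoming sync message by calling the handler directly
res:.z.pg "1+1"
```

The default behaviour can be restored with \x .z.pg, in the same way the chapter resets .z.pi.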

Aug 10

Q Language – Dictionaries

Dictionaries are an extension of lists which provide the foundation for creating tables. In mathematical terms, a dictionary creates a "domain → range" relationship or, in general (short) terms, a "key → value" relationship between elements.

A dictionary is an ordered collection of key-value pairs that is roughly equivalent to a hash table. It is a mapping defined by an explicit I/O association between a domain list and a range list via positional correspondence. A dictionary is created with the ! (bang) operator:

ListOfDomain ! ListOfRange

The most basic dictionary maps a simple list to a simple list.

Input (I)   Output (O)
`Name       `John
`Age        36
`Sex        "M"
`Weight     60.3

q)d:`Name`Age`Sex`Weight!(`John;36;"M";60.3) / Create a dictionary d
q)d
Name  | `John
Age   | 36
Sex   | "M"
Weight| 60.3
q)count d / To get the number of entries in a dictionary.
4
q)key d / The function key returns the domain
`Name`Age`Sex`Weight
q)value d / The function value returns the range.
`John
36
"M"
60.3
q)cols d / The function cols also returns the domain.
`Name`Age`Sex`Weight

Lookup

Finding the dictionary output value corresponding to an input value is called looking up the input.

q)d[`Name] / Accessing the value of domain `Name
`John
q)d[`Name`Sex] / extended item-wise to a simple list of keys
`John
"M"

Lookup with Verb @

q)d1:`one`two`three!9 18 27
q)d1[`two]
18
q)d1@`two
18

Operations on Dictionaries

Amend and Upsert

As with lists, the items of a dictionary can be modified via indexed assignment.

q)d:`Name`Age`Sex`Weight!(`John;36;"M";60.3) / A dictionary d
q)d[`Age]:35 / Assigning a new value to key Age
q)d / New value assigned to key Age in d
Name  | `John
Age   | 35
Sex   | "M"
Weight| 60.3

Dictionaries can be extended via index assignment.

q)d[`Height]:"182 Ft"
q)d
Name  | `John
Age   | 35
Sex   | "M"
Weight| 60.3
Height| "182 Ft"

Reverse Lookup with Find (?)

The find (?)
operator is used to perform reverse lookup by mapping a range element to its domain element.

q)d2:`x`y`z!99 88 77
q)d2?77
`z

If the range values are not unique, find returns the first domain item that maps to the given value.

Removing Entries

To remove an entry from a dictionary, the delete ( _ ) function is used. The left operand of ( _ ) is the dictionary and the right operand is a key value.

q)d2:`x`y`z!99 88 77
q)d2 _`z
x| 99
y| 88

Whitespace is required to the left of _ if the first operand is a variable.

q)`x`y _ d2 / Deleting multiple entries
z| 77

Column Dictionaries

Column dictionaries are the basis for the creation of tables. Consider the following example:

q)scores: `name`id!(`John`Jenny`Jonathan;9 18 27) / Dictionary scores
q)scores[`name] / The values for the name column
`John`Jenny`Jonathan
q)scores.name / Retrieving the values for a column in a column dictionary using dot notation.
`John`Jenny`Jonathan
q)scores[`name][1] / Value in row 1 of the name column
`Jenny
q)scores[`id][2] / Value in row 2 of the id column
27

Flipping a Dictionary

The net effect of flipping a column dictionary is simply reversing the order of the indices. This is logically equivalent to transposing the rows and columns.

Flip on a Column Dictionary

The transpose of a dictionary is obtained by applying the unary flip operator. Take a look at the following example:

q)scores
name| John Jenny Jonathan
id  | 9    18    27
q)flip scores
name     id
-----------
John     9
Jenny    18
Jonathan 27

Flip of a Flipped Column Dictionary

If you transpose a dictionary twice, you obtain the original dictionary,

q)scores ~ flip flip scores
1b
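Two further dictionary operations follow naturally from the material above and are sketched here: joining two dictionaries with the join operator (,), which updates existing keys and adds new ones, and indexing the table produced by flipping a column dictionary. The variable names are illustrative only.

```q
d:`a`b`c!10 20 30

/ join: the value for b is replaced, the new key d is appended
d2:d,`b`d!200 40

/ a flipped column dictionary is a table
t:flip `name`id!(`John`Jenny`Jonathan;9 18 27)

row:t[1]          / a single row comes back as a dictionary
ids:t`id          / a column comes back as a list
```

Indexing the table by row number yields a dictionary keyed by column name, which is the reverse of the view given by the column dictionary.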

Aug 10

Q Language – Tables on Disk

Data on your hard disk (also called a historical database) can be saved in three different formats: flat files, splayed tables, and partitioned tables. Here we will learn how to use these three formats to save data.

Flat File

Flat files are fully loaded into memory, which is why their size (memory footprint) should be small. Tables are saved on disk entirely in one file (so size matters).

The functions used to manipulate these tables are set and get:

`:path_to_file/filename set tablename

Let's take an example to demonstrate how it works:

q)tables `.
`s#`t`tab`tab1
q)`:c:/q/w32/tab1_test set tab1
`:c:/q/w32/tab1_test

In a Windows environment, the flat file in this example is saved at the location C:\q\w32.

To read the flat file back from your disk (historical db), use the get command as follows:

q)tab2: get `:c:/q/w32/tab1_test
q)tab2
sym    | time         price    size
-------| --------------------------
APPLE  | 11:16:39.779 8.388858 12
MSFT   | 11:16:39.779 19.59907 10
IBM    | 11:16:39.779 37.5638  1
SAMSUNG| 11:16:39.779 61.37452 90
APPLE  | 11:16:39.779 52.94808 73

A new table tab2 is created with the contents stored in the tab1_test file.

Splayed Tables

If there are too many columns in a table, we store it in splayed format, i.e., we save it on disk as a directory. Inside the directory, each column is saved in a separate file named after the column. Each column is saved as a list of the corresponding type in a kdb+ binary file.

Saving a table in splayed format is very useful when we frequently have to access only a few of its many columns. A splayed table directory also contains a .d binary file, which holds the order of the columns.

Much like a flat file, a table can be saved as splayed by using the set command.
To save a table as splayed, the file path should end with a trailing slash:

`:path_to_filename/filename/ set tablename

For reading a splayed table, we can use the get function:

tablename: get `:path_to_file/filename

Note: For a table to be saved as splayed, it should be un-keyed and enumerated.

[Figure: file structure of a splayed table on Windows]

Partitioned Tables

Partitioned tables provide an efficient means to manage huge tables containing significant volumes of data. Partitioned tables are splayed tables spread across several partitions (directories).

Inside each partition, a table has its own directory, with the structure of a splayed table. The tables can be split on a day/month/year basis in order to provide optimized access to their content.

To get the content of a partition, use the following code block:

q)get `:c:/q/data/2000.01.13 // "get" command used, sample folder
quote| +`sym`time`bid`ask`bsize`asize`ex!(`p#`sym!0 0 0 0 0 0 0 0 0 0 0 0 0 0….
trade| +`sym`time`price`size`ex!(`p#`sym!0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ….

Let's try to get the contents of the trade table:

q)get `:c:/q/data/2000.01.13/trade
sym time         price     size ex
----------------------------------
0   09:30:00.496 0.4092016 7    T
0   09:30:00.501 1.428629  4    N
0   09:30:00.707 0.5647834 6    T
0   09:30:00.781 1.590509  5    T
0   09:30:00.848 2.242627  3    A
0   09:30:00.860 2.277041  8    T
0   09:30:00.931 0.8044885 8    A
0   09:30:01.197 1.344031  2    A
0   09:30:01.337 1.875     3    A
0   09:30:01.399 2.187723  7    A

Note: The partitioned mode is suitable for tables with millions of records per day (i.e., time series data).

Sym File

The sym file is a kdb+ binary file containing the list of symbols from all splayed and partitioned tables. It can be read with,

get `:sym

par.txt File (optional)

This is a configuration file, used when partitions are spread over several directories or disk drives. It contains the paths to the disk partitions.
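The flat-file round trip with set and get described in this chapter can be sketched as below. The path /tmp/qdemo_flat_tab and the table contents are hypothetical; any writable location works.

```q
/ hypothetical table to persist
t:([]sym:`APPLE`MSFT`IBM;price:8.39 19.6 37.56;size:12 10 1)

f:`:/tmp/qdemo_flat_tab   / hypothetical file handle

saved:f set t             / writes the whole table to one file, returns the handle
t2:get f                  / reads the whole file back into memory
same:t2~t                 / the round trip preserves the table

hdel f                    / remove the demo file
```

Because a flat file is read and written as a whole, this format suits small reference tables better than large time series data.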