Many of the functions we have seen so far work on arrays and tuples. Arrays are just one type of collection; Julia has other kinds of collections too. One such collection is the dictionary (Dict), which associates keys with values. That is why it is called an ‘associative collection’.
To understand it better, we can compare it with a simple look-up table: we supply a single piece of information called the key (such as a number, string or symbol), and the dictionary returns the corresponding data value.
Creating Dictionaries
The syntax for creating a simple dictionary is as follows −
Dict("key1" => value1, "key2" => value2, …, "keyn" => valuen)
In the above syntax, key1, key2, …, keyn are the keys and value1, value2, …, valuen are the corresponding values. The => operator constructs a Pair object, so "key1" => value1 is equivalent to Pair("key1", value1). We cannot have two entries with the same key because keys are always unique in dictionaries.
Example
julia> first_dict = Dict("X" => 100, "Y" => 110, "Z" => 220)
Dict{String,Int64} with 3 entries:
  "Y" => 110
  "Z" => 220
  "X" => 100
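To see what => does on its own, here is a minimal sketch. The names p and scores are illustrative and are not used elsewhere in this chapter.

```julia
# `=>` builds a Pair; a Dict is constructed from one or more pairs.
p = "X" => 100
println(p isa Pair)               # true
println(p.first, " ", p.second)   # X 100

# Illustrative data: a dictionary built from an existing pair plus a new one.
scores = Dict(p, "Y" => 110)
println(scores["X"])              # 100
println(length(scores))           # 2
```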
We can also create dictionaries with the help of comprehension syntax. The example is given below −
Example
julia> first_dict = Dict(string(x) => sind(x) for x = 0:5:360)
Dict{String,Float64} with 73 entries:
  "320" => -0.642788
  "65"  => 0.906308
  "155" => 0.422618
  "335" => -0.422618
  "75"  => 0.965926
  "50"  => 0.766044
  ⋮     => ⋮
Keys
As discussed earlier, dictionaries have unique keys. It means that if we assign a value to a key that already exists, we will not be creating a new entry but replacing the value stored under the existing key. Following are some operations on dictionaries regarding keys −
Searching for a key
We can use the haskey() function to check whether the dictionary contains a key or not −
julia> first_dict = Dict("X" => 100, "Y" => 110, "Z" => 220)
Dict{String,Int64} with 3 entries:
  "Y" => 110
  "Z" => 220
  "X" => 100

julia> haskey(first_dict, "Z")
true

julia> haskey(first_dict, "A")
false
Searching for a key/value pair
We can use the in() function to check whether the dictionary contains a key/value pair or not −
julia> in(("X" => 100), first_dict)
true

julia> in(("X" => 220), first_dict)
false
Add a new key-value
We can add a new key-value pair to an existing dictionary as follows −
julia> first_dict["R"] = 400
400

julia> first_dict
Dict{String,Int64} with 4 entries:
  "Y" => 110
  "Z" => 220
  "X" => 100
  "R" => 400
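Because keys are unique, the same assignment syntax either adds an entry or replaces a value, depending on whether the key is already present. A small sketch, using a throwaway dictionary d that is not part of the running example:

```julia
# Illustrative dictionary, separate from first_dict above.
d = Dict("X" => 100, "Y" => 110)

d["R"] = 400   # "R" is a new key, so an entry is added
d["X"] = 150   # "X" already exists, so its value is replaced

println(length(d))   # 3
println(d["X"])      # 150
```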
Delete a key
We can use the delete!() function to delete a key (and its value) from an existing dictionary −
julia> delete!(first_dict, "R")
Dict{String,Int64} with 3 entries:
  "Y" => 110
  "Z" => 220
  "X" => 100
Getting all the keys
We can use the keys() function to get all the keys from an existing dictionary −
julia> keys(first_dict)
Base.KeySet for a Dict{String,Int64} with 3 entries. Keys:
  "Y"
  "Z"
  "X"
Values
Every key in a dictionary has a corresponding value. Following are some operations on dictionaries regarding values −
Retrieving all the values
We can use the values() function to get all the values from an existing dictionary −
julia> values(first_dict)
Base.ValueIterator for a Dict{String,Int64} with 3 entries. Values:
  110
  220
  100
Dictionaries as iterable objects
We can loop over a dictionary directly and process each key/value pair, which shows that dictionaries are iterable objects −
for kv in first_dict
    println(kv)
end
"Y" => 110
"Z" => 220
"X" => 100
Here kv is a Pair (not a tuple) that holds one key/value pair on each iteration.
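The loop variable can also be destructured into the key and the value directly. A minimal sketch, assuming a small dictionary d and a helper array seen introduced only for this illustration:

```julia
# Illustrative dictionary; iteration order of a Dict is not guaranteed.
d = Dict("X" => 100, "Y" => 110, "Z" => 220)

seen = String[]
for (key, value) in d          # each Pair is unpacked into key and value
    push!(seen, "$key=$value")
end

# Sort before printing so the result does not depend on iteration order.
println(sort(seen))            # ["X=100", "Y=110", "Z=220"]
```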
Sorting a dictionary
Dictionaries do not store their keys in any particular order, so iterating over a dictionary does not produce sorted output. To obtain the items in order, we can sort the keys −
Example
julia> first_dict = Dict("R" => 100, "S" => 220, "T" => 350, "U" => 400, "V" => 575, "W" => 670)
Dict{String,Int64} with 6 entries:
  "S" => 220
  "U" => 400
  "T" => 350
  "W" => 670
  "V" => 575
  "R" => 100

julia> for key in sort(collect(keys(first_dict)))
          println("$key => $(first_dict[key])")
       end
R => 100
S => 220
T => 350
U => 400
V => 575
W => 670
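An alternative is to collect the key/value pairs themselves and sort them by key, which avoids the second look-up inside the loop. This is only a sketch with an illustrative dictionary d; sorted_pairs is a name chosen here.

```julia
# Illustrative dictionary, deliberately written out of order.
d = Dict("S" => 220, "R" => 100, "T" => 350)

# collect(d) gives an array of Pairs; `by = first` sorts on the key.
sorted_pairs = sort(collect(d), by = first)

for (key, value) in sorted_pairs
    println("$key => $value")
end
# R => 100
# S => 220
# T => 350
```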
We can also use the SortedDict data type from the DataStructures.jl Julia package to make sure that the dictionary remains sorted at all times. You can check the example below −
Example
julia> import DataStructures

julia> first_dict = DataStructures.SortedDict("S" => 220, "T" => 350, "U" => 400, "V" => 575, "W" => 670)
DataStructures.SortedDict{String,Int64,Base.Order.ForwardOrdering} with 5 entries:
  "S" => 220
  "T" => 350
  "U" => 400
  "V" => 575
  "W" => 670

julia> first_dict["R"] = 100
100

julia> first_dict
DataStructures.SortedDict{String,Int64,Base.Order.ForwardOrdering} with 6 entries:
  "R" => 100
  "S" => 220
  "T" => 350
  "U" => 400
  "V" => 575
  "W" => 670
Word Counting Example
One of the simple applications of dictionaries is to count how many times each word appears in a text. The concept behind this application is that each word is a key, and the value of that key is the number of times that particular word appears in that piece of text.
In the following example, we will be counting the words in a file named NLP.txt (saved on the desktop) −
julia> f = open("C://Users//Leekha//Desktop//NLP.txt")
IOStream()

julia> wordlist = String[]
String[]

julia> for line in eachline(f)
          words = split(line, r"\W")
          map(w -> push!(wordlist, lowercase(w)), words)
       end

julia> filter!(!isempty, wordlist)
984-element Array{String,1}:
 "natural"
 "language"
 "processing"
 "semantic"
 "analysis"
 "introduction"
 "to"
 "semantic"
 "analysis"
 "the"
 "purpose"
 ⋮

julia> close(f)
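The splitting step can be tried without a file. The sketch below assumes the separator is the regex class \W (any non-word character) and uses a made-up sentence in place of the contents of NLP.txt.

```julia
# Made-up sample text; the real example reads lines from a file instead.
text = "Natural Language Processing: semantic analysis, the purpose."

wordlist = String[]
for w in split(text, r"\W")        # split on every non-word character
    push!(wordlist, lowercase(w))
end

# Adjacent separators (such as ": ") produce empty strings; drop them.
filter!(!isempty, wordlist)

println(wordlist)
# ["natural", "language", "processing", "semantic", "analysis", "the", "purpose"]
```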
We can see from the above output that wordlist is now an array of 984 elements.
We can create a dictionary to store the words and word count −
julia> wordcounts = Dict{String,Int64}()
Dict{String,Int64}()

julia> for word in wordlist
          wordcounts[word] = get(wordcounts, word, 0) + 1
       end
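The counting loop relies on get(), which returns the stored value if the key exists and the supplied default otherwise, so a word seen for the first time starts from 0. A minimal sketch on a short, made-up word list (counts is a name chosen for this illustration):

```julia
counts = Dict{String,Int64}()

# Made-up word list standing in for the words read from the file.
for word in ["to", "be", "or", "not", "to", "be"]
    counts[word] = get(counts, word, 0) + 1   # default 0 for unseen words
end

println(counts["to"])           # 2
println(get(counts, "is", 0))   # 0, and "is" is not inserted
println(haskey(counts, "is"))   # false
```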
To find out how many times the words appear, we can look up the words in the dictionary as follows −
julia> wordcounts["natural"]
1

julia> wordcounts["processing"]
1

julia> wordcounts["and"]
14
We can also print the words in alphabetical order, together with their counts, as follows −
julia> for i in sort(collect(keys(wordcounts)))
          println("$i, $(wordcounts[i])")
       end
1, 2
2, 2
3, 2
4, 2
5, 1
a, 28
about, 3
above, 2
act, 1
affixes, 3
all, 2
also, 5
an, 5
analysis, 15
analyze, 1
analyzed, 1
analyzer, 2
and, 14
answer, 5
antonymies, 1
antonymy, 1
application, 3
are, 11
…
To find the most common words, we can use collect() to convert the dictionary to an array of key/value pairs and then sort the array by count in descending order as follows −
julia> sort(collect(wordcounts), by = tuple -> last(tuple), rev=true)
276-element Array{Pair{String,Int64},1}:
 "the" => 76
 "of" => 47
 "is" => 39
 "a" => 28
 "words" => 23
 "meaning" => 23
 "semantic" => 22
 "lexical" => 21
 "analysis" => 15
 "and" => 14
 "in" => 14
 "be" => 13
 "it" => 13
 "example" => 13
 "or" => 12
 "word" => 12
 "for" => 11
 "are" => 11
 "between" => 11
 "as" => 11
 ⋮
 "each" => 1
 "river" => 1
 "homonym" => 1
 "classification" => 1
 "analyze" => 1
 "nocturnal" => 1
 "axis" => 1
 "concept" => 1
 "deals" => 1
 "larger" => 1
 "destiny" => 1
 "what" => 1
 "reservation" => 1
 "characterization" => 1
 "second" => 1
 "certitude" => 1
 "into" => 1
 "compound" => 1
 "introduction" => 1
We can check the 10 most frequent words as follows −
julia> sort(collect(wordcounts), by = tuple -> last(tuple), rev=true)[1:10]
10-element Array{Pair{String,Int64},1}:
 "the" => 76
 "of" => 47
 "is" => 39
 "a" => 28
 "words" => 23
 "meaning" => 23
 "semantic" => 22
 "lexical" => 21
 "analysis" => 15
 "and" => 14
We can use the filter() function to find all the words that start with a particular letter (say 'n') and, as in this example, appear fewer than 4 times.
julia> filter(tuple -> startswith(first(tuple), "n") && last(tuple) < 4, collect(wordcounts))
6-element Array{Pair{String,Int64},1}:
 "none" => 2
 "not" => 3
 "namely" => 1
 "name" => 1
 "natural" => 1
 "nocturnal" => 1
Sets
Like an array or a dictionary, a set is a collection, but one that contains only unique elements. Following are the differences between sets and other kinds of collections −
- In a set, we can have only one of each element.
- The order of elements is not important in a set.
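Both properties can be checked quickly. A minimal sketch with illustrative integer data:

```julia
# Duplicates in the input collapse to a single element.
s = Set([3, 1, 2, 3, 1])
println(length(s))             # 3

# Two sets with the same elements are equal regardless of order.
println(s == Set([1, 2, 3]))   # true
```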
Creating a Set
With the help of the Set constructor, we can create an empty set as follows −
julia> var_color = Set()
Set{Any}()
We can also specify the element type of the set as follows −
julia> num_primes = Set{Int64}()
Set{Int64}()
We can also create and fill the set as follows −
julia> var_color = Set{String}(["red","green","blue"])
Set{String} with 3 elements:
  "blue"
  "green"
  "red"
Alternatively, as with arrays, we can use the push!() function to add elements to a set as follows −
julia> push!(var_color, "black")
Set{String} with 4 elements:
  "blue"
  "green"
  "black"
  "red"
We can use the in() function to check whether an element is in the set −
julia> in("red", var_color)
true

julia> in("yellow", var_color)
false
Standard operations
Union, intersection, and difference are some standard operations we can do with sets. The corresponding functions for these operations are union(), intersect() and setdiff().
Union
In general, the union operation returns a set containing every element that appears in at least one of the given sets.
Example
julia> color_rainbow = Set(["red","orange","yellow","green","blue","indigo","violet"])
Set{String} with 7 elements:
  "indigo"
  "yellow"
  "orange"
  "blue"
  "violet"
  "green"
  "red"

julia> union(var_color, color_rainbow)
Set{String} with 8 elements:
  "indigo"
  "yellow"
  "orange"
  "blue"
  "violet"
  "green"
  "black"
  "red"
Intersection
In general, the intersection operation takes two or more sets as inputs and returns the elements that are common to all of them.
Example
julia> intersect(var_color, color_rainbow)
Set{String} with 3 elements:
  "blue"
  "green"
  "red"
Difference
In general, the difference operation takes two or more sets as input. It returns the elements of the first set that do not appear in any of the other sets.
Example
julia> setdiff(var_color, color_rainbow)
Set{String} with 1 element:
  "black"
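The three operations can be compared side by side on small integer sets. This is only a sketch with illustrative sets a and b; note that the difference depends on the order of its arguments.

```julia
# Illustrative sets, unrelated to the colour examples above.
a = Set([1, 2, 3, 4])
b = Set([3, 4, 5])

println(union(a, b)     == Set([1, 2, 3, 4, 5]))  # true
println(intersect(a, b) == Set([3, 4]))           # true
println(setdiff(a, b)   == Set([1, 2]))           # true
println(setdiff(b, a)   == Set([5]))              # true: argument order matters
```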
Some Functions on Dictionary
In the example below, you will see that the functions that work on arrays and sets also work on collections like dictionaries −
julia> dict1 = Dict(100=>"X", 220 => "Y")
Dict{Int64,String} with 2 entries:
  100 => "X"
  220 => "Y"

julia> dict2 = Dict(220 => "Y", 300 => "Z", 450 => "W")
Dict{Int64,String} with 3 entries:
  450 => "W"
  220 => "Y"
  300 => "Z"
Union
julia> union(dict1, dict2)
4-element Array{Pair{Int64,String},1}:
 100 => "X"
 220 => "Y"
 450 => "W"
 300 => "Z"
Intersect
julia> intersect(dict1, dict2)
1-element Array{Pair{Int64,String},1}:
 220 => "Y"
Difference
julia> setdiff(dict1, dict2)
1-element Array{Pair{Int64,String},1}:
 100 => "X"
Merging two dictionaries
julia> merge(dict1, dict2)
Dict{Int64,String} with 4 entries:
  100 => "X"
  450 => "W"
  220 => "Y"
  300 => "Z"
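In the example above both dictionaries agree on the value for key 220. When the values differ, merge() keeps the value from the last dictionary listed and leaves its arguments unchanged. A minimal sketch with illustrative dictionaries d1 and d2:

```julia
# Illustrative dictionaries that disagree on the value for key 220.
d1 = Dict(100 => "X", 220 => "Y")
d2 = Dict(220 => "YY", 300 => "Z")

merged = merge(d1, d2)

println(merged[220])      # YY  (value from the last dictionary wins)
println(length(merged))   # 3
println(d1[220])          # Y   (merge returns a new Dict; d1 is untouched)
```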
Finding the smallest element
julia> dict1
Dict{Int64,String} with 2 entries:
  100 => "X"
  220 => "Y"

julia> findmin(dict1)
("X", 100)
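Note that findmin() compares the values of the dictionary, not the keys, and returns the smallest value together with the key it is stored under. A short sketch with an illustrative dictionary prices:

```julia
# Illustrative data: item names mapped to prices.
prices = Dict("tea" => 3, "coffee" => 5, "water" => 1)

# The result is (smallest value, key of that value).
lowest, item = findmin(prices)

println(lowest)   # 1
println(item)     # water
```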