www.digitalmars.com

D Programming Language 2.0


Last update Sun Dec 30 20:21:55 2012

Associative Arrays

Associative arrays have an index that is not necessarily an integer, and can be sparsely populated. The index for an associative array is called the key, and its type is called the KeyType.

Associative arrays are declared by placing the KeyType within the [ ] of an array declaration:

int[string] b;    // associative array b of ints that are
                  // indexed by an array of characters.
                  // The KeyType is string
b["hello"] = 3;   // set value associated with key "hello" to 3
func(b["hello"]); // pass 3 as parameter to func()

Particular keys in an associative array can be removed with the remove function:

b.remove("hello");

The InExpression yields a pointer to the value if the key is in the associative array, or null if not:

int* p;
p = ("hello" in b);
if (p !is (B null))
        ...

KeyTypes cannot be functions or voids.

The element types of an associative array cannot be functions or voids.

Using Classes as the KeyType

Classes can be used as the KeyType. For this to work, the class definition must override the following member functions of class Object:

hash_t is an alias to an integral type.

Note that the parameter to opCmp and opEquals is of type Object, not the type of the class in which it is defined.

For example:

class Foo {
  int a, b;

  hash_t toHash() { return a + b; }

  bool opEquals(Object o)
  { Foo foo = cast(Foo) o;
    return foo && a == foo.a && b == foo.b;
  }

  int opCmp(Object o)
  { Foo foo = cast(Foo) o;
    if (!foo)
      return -1;
    if (a == foo.a)
      return b - foo.b;
    return a - foo.a;
  }
}

The implementation may use either opEquals or opCmp or both. Care should be taken so that the results of opEquals and opCmp are consistent with each other when the class objects are the same or not.

Using Structs or Unions as the KeyType

If the KeyType is a struct or union type, a default mechanism is used to compute the hash and comparisons of it based on the binary data within the struct value. A custom mechanism can be used by providing the following functions as struct members:

const hash_t toHash();
const bool opEquals(ref const KeyType s);
const int opCmp(ref const KeyType s);

For example:

import std.string;

struct MyString {
  string str;

  const hash_t toHash()
  { hash_t hash;
    foreach (char c; str)
      hash = (hash * 9) + c;
    return hash;
  }

  const bool opEquals(ref const MyString s)
  {
    return std.string.cmp(this.str, s.str) == 0;
  }

  const int opCmp(ref const MyString s)
  {
    return std.string.cmp(this.str, s.str);
  }
}

The implementation may use either opEquals or opCmp or both. Care should be taken so that the results of opEquals and opCmp are consistent with each other when the struct/union objects are the same or not.

Runtime Initialization of Immutable AAs

Immutable associative arrays are often desirable, but sometimes initialization must be done at runtime. This can be achieved with a constructor (static constructor depending on scope), a buffer associative array and assumeUnique:

immutable long[string] aa;

static this()
{
    import std.exception : assumeUnique;
    import std.conv : to;

    long[string] temp; // mutable buffer
    foreach(i; 0 .. 10)
    {
        temp[to!string(i)] = i;
    }
    temp.rehash; // for faster lookups

    aa = assumeUnique(temp);
}

unittest
{
    assert(aa["1"] == 1);
    assert(aa["5"] == 5);
    assert(aa["9"] == 9);
}

Properties

Properties for associative arrays are:
Associative Array Properties
Property Description
.sizeof Returns the size of the reference to the associative array; it is 4 in 32-bit builds and 8 on 64-bit builds.
.length Returns number of values in the associative array. Unlike for dynamic arrays, it is read-only.
.keys Returns dynamic array, the elements of which are the keys in the associative array.
.values Returns dynamic array, the elements of which are the values in the associative array.
.rehash Reorganizes the associative array in place so that lookups are more efficient. rehash is effective when, for example, the program is done loading up a symbol table and now needs fast lookups in it. Returns a reference to the reorganized array.
.byKey() Returns a delegate suitable for use as an Aggregate to a ForeachStatement which will iterate over the keys of the associative array.
.byValue() Returns a delegate suitable for use as an Aggregate to a ForeachStatement which will iterate over the values of the associative array.
.get(Key key, lazy Value defaultValue) Looks up key; if it exists returns corresponding value else evaluates and returns defaultValue.

Associative Array Example: word count

Let's consider the file is ASCII encoded with LF EOL. In general case we should use dchar c for iteration over code points and functions from std.uni.

import std.file;         // D file I/O
import std.stdio;
import std.ascii;

void main (string[] args) {
  ulong totalWords, totalLines, totalChars;
  ulong[string] dictionary;

  writefln("   lines   words   bytes file");
  foreach (arg; args[1 .. $]) // for each argument except the first one
  {
    ulong wordCount, lineCount, charCount;

    foreach(line; File(arg).byLine())
    {
      bool inWord;
      size_t wordStart;

      void tryFinishWord(size_t wordEnd)
      {
        if (inWord)
        {
          auto word = line[wordStart .. wordEnd];
          ++dictionary[word.idup];   // increment count for word
          inWord = false;
        }
      }

      foreach (i, char c; line)
      {
        if (std.ascii.isDigit(c))
        { // c is a digit (0..9)
        }
        else if (std.ascii.isAlpha(c))
        { // c is an ASCII letter (A..Z, a..z)
          if (!inWord)
          {
            wordStart = i;
            inWord = true;
            ++wordCount;
          }
        }
        else
          tryFinishWord(i);
        ++charCount;
      }
      tryFinishWord(line.length);
      ++lineCount;
    }

    writefln("%8s%8s%8s %s", lineCount, wordCount, charCount, arg);
    totalWords += wordCount;
    totalLines += lineCount;
    totalChars += charCount;
  }

  if (args.length > 2)
  {
    writefln("-------------------------------------\n%8s%8s%8s total",
            totalLines, totalWords, totalChars);
  }

  writeln("-------------------------------------");
  foreach (word; dictionary.keys.sort)
  {
    writefln("%3s %s", dictionary[word], word);
  }
}




Forums | Comments |  D  | Search | Downloads | Home