# Change Dir

• 随笔 - 222
• 文章 - 0
• 评论 - 182
• 引用 - 0

• 积分 - 414550
• 排名 - 133

## 分享代码系列——HashedArrayTree

HashedArrayTree是一种可变动态数组结构，不像通用动态数组一样是线性结构。HashedArrayTree（以下简称HAT）内部维护了一个数组列表（或者说一个二维数组更好理解）

HAT扩容：

   1: /***************************************************************************
   2:  * File: HashedArrayTree.java
   3:  * Author: Keith Schwarz (htiek@cs.stanford.edu)
   4:  *
   5:  * An implementation of the List abstraction backed by a hashed array tree
   6:  * (HAT), a data structure supporting amortized O(1) lookup, append, and
   7:  * last-element removal.  In this sense it is akin to a standard dynamic
   8:  * array implementation.  However, a hashed array tree also has the advantage
   9:  * that its memory overhead is only O(sqrt(n)) rather than the typical O(n)
  10:  * found in dynamic arrays and their variants.
  11:  *
  12:  * Internally, the hashed array tree is implemented as an array of pointers
  13:  * that optionally point to an array of elements.  The topmost array and each
  14:  * element always have the same size, which is always a power of two.  In
  15:  * this sense, the hashed array tree is essentially a two-dimensional array
  16:  * of elements.  However, the advantage of the hashed array tree is that the
  17:  * topmost array pointers are all initially null and only filled in when space
  18:  * is needed.  This means that the maximum overhead of the structure is the
  19:  * size of the topmost array, plus the number of unused elements in the current
  20:  * block.  Here is one sample HAT:
  21:  *
  22:  *           [ ] [ ] [ ] [ ]
  23:  *            |   |   |
  24:  *            v   v   v
  25:  *             
  26:  *             
  27:  *             [ ]
  28:  *             [ ]
  29:  *
  30:  * Here, the topmost array of pointers has three pointers in use, each of which
  31:  * point to an array of the corresponding number of elements.
  32:  *
  33:  * Whenever an element is added to a hashed array tree, one of three cases
  34:  * must hold:
  35:  *
  36:  * 1. There is extra space at the end of the final subarray (for example, in
  37:  *    the top picture).  In that case, the element is added to that position.
  38:  * 2. The final subarray is full, but space for another subarray exists in the
  39:  *    topmost array.  In that case, a new array is allocated and the element
  40:  *    is added to that array.
  41:  * 3. The final subarray is full and no new arrays remain open.  In that case,
  42:  *    since the topmost array has size 2^n and each array has size 2^n, there
  43:  *    must be a total of 2^(2n) elements in the hashed array tree.  We
  44:  *    next double the size of the topmost array to 2^(n + 1), then allocate
  45:  *    2^(n - 1) subarrays of size 2^(n + 1) elements for a total of 2^(2n)
  46:  *    elements' worth of space.  The elements from the old HAT are then copied
  47:  *    over, a new array is allocated for the new element, and it is added as
  48:  *    the first element of that array.
  49:  *
  50:  * Let's now talk about the performance and memory usage of this structure.
  51:  * First, we note that we can perform lookups in O(1), assuming that the
  52:  * machine is transdichotomous (meaning that a single machine word can hold
  53:  * the size of any array).  This can be done by breaking the input index in
  54:  * half, then using the first half to choose which array to look into (the
  55:  * "hashed" part of HAT) and the second half to choose which index to select.
  56:  * This trick is similar to the trick used to represent a two-dimensional
  57:  * array using a linear structure.
  58:  *
  59:  * Next, let's think about how much time it takes to do an append operation.
  60:  * Each append when space remains takes O(1) to look up the proper position
  61:  * in the hashed array tree for writing, but appends can be much more expensive
  62:  * when the HAT needs to be resized.  Fortunately, this does not happen very
  63:  * often.  Whenever the HAT doubles in size, its capacity grows from 2^(2n) to
  64:  * 2^(2n + 2), meaning that four times as many elements must be inserted before
  65:  * the next copy operation.  If we define a potential function as twice the
  66:  * number of filled-in elements in the HAT above half capacity (i.e. the
  67:  * number of elements in the arrays in the second half), then we can prove an
  68:  * amortized O(1) time for append.  Consider any series of appends.  If the
  69:  * append does not expand the HAT, then it takes O(1) time and increases the
  70:  * potential by 1/2.  If the append does expand the HAT, and the HAT's topmost
  71:  * array has size 2^n, then there must be 2^(n-1) elements in the latter half
  72:  * of the HAT, so the potential is 2^(2n).  The time required to move each
  73:  * of the 2^(2n) elements is 2^(2n), and in the new HAT there are no elements
  74:  * in the latter half of the array.  Consequently, the new potential is zero.
  75:  * The actual time required to perform the append is thus 2^2n + O(1), and
  76:  * the decrease in potential is -2^2n, so the amortized cost is O(1) as
  77:  * expected.
  78:  *
  79:  * Last, let's talk about the cost to do a remove.  This is similar to the
  80:  * append case - we delete the last element of the last array, removing the
  81:  * array from the topmost array if it becomes empty.  We also compact the
  82:  * HAT if it becomes too sparse by shrinking from a HAT of size 2^(2n + 2) to
  83:  * a HAT of size 2^(2n) if the HAT becomes one-eighth full.  A similar
  84:  * potential method can be used to show that this operation runs in amortized
  85:  * O(1).
  86:  *
  87:  * Finally, let's consider the memory overhead of the HAT.  For any HAT of
  88:  * topmost array size 8 or more, since the HAT is always at least one-eighth
  89:  * full, there must be at least one full array.  This array has size equal to
  90:  * the size of the topmost array (call this k), and so if every array were to
  91:  * be filled in to capacity, there would be a total of k arrays of size k,
  92:  * for a total of k^2 elements.  Of this capacity, we know that at least an
  93:  * eighth of them are filled in, so k^2 must be at most 8n, and so
  94:  * k = O(sqrt(n)).  To finish the analysis, the overhead of the structure is
  95:  * at most the overhead of this top-level array, plus potentially k - 1
  96:  * unused elements in some array.  This is a total of O(k) = O(sqrt(n))
  97:  * overhead, which is what we originally desired.
  98:  */
  99: import java.util.*; // For AbstractList
 100:
 101: @SuppressWarnings("unchecked")
 102: public final class HashedArrayTree<T> extends AbstractList<T> {
 103:     /* To simplify the implementation, we enforce that the size of the topmost
 104:      * array never drops below 2.  This prevents weirdness when we try to
 105:      * allocate 2^(n-1) arrays during a doubling and find that n = 0.
 106:      */
 107:     private static final int kMinArraySize = 2;
 108:
 109:     /* The topmost array of elements; initially of size two. */
 110:     private T[][] mArrays = (T[][]) new Object[kMinArraySize][];
 111:
 112:     /* Number of elements, initially zero since the HAT is created empty. */
 113:     private int mSize = 0;
 114:
 115:     /* A constant containing lg2 of the topmost array size.  This enables some
 116:      * cute bit-twiddling tricks to improve efficiency.
 117:      */
 118:     private int mLgSize = 1;
 119:
 120:     /**
 121:      * Returns the number of elements in the HashedArrayTree.
 122:      *
 123:      * @return The number of elements in the HashedArrayTree.
 124:      */
 125:     @Override
 126:     public int size() {
 127:         return mSize;
 128:     }
 129:
 130:     /**
 131:      * Adds a new element to the HashedArrayTree.
 132:      *
 133:      * @param elem The element to add.
 134:      * @return true
 135:      */
 136:     @Override
 137:     public boolean add(T elem) {
 138:         /* First, check if we're completely out of space.  If so, do a resize
 139:          * to ensure we do indeed have room.
 140:          */
 141:         if (size() == mArrays.length * mArrays.length)
 142:             grow();
 143:
 144:         /* Compute the (arr, index) pair for the next position.  The next
 145:          * position is at the location indicated by size(), but we know that
 146:          * space exists from the previous call.
 147:          */
 148:         final int offset = computeOffset(size());
 149:         final int index  = computeIndex(size());
 150:
 151:         /* Check if an array exists here.  If not, make one up. */
 152:         if (mArrays[offset] == null)
 153:             mArrays[offset] = (T[]) new Object[mArrays.length];
 154:
 155:         /* Write the element to its location. */
 156:         mArrays[offset][index] = elem;
 157:
 158:         /* Update the element count. */
 159:         ++mSize;
 160:
 161:         /* Per the Collections contract, return true to signal a successful
 162:          * add.
 163:          */
 164:         return true;
 165:     }
 166:
 167:     /**
 168:      * Sets the element at the specified position to the indicated value.
 169:      * If the index is out of bounds, throws an IndexOutOfBounds exception.
 170:      *
 171:      * @param index The index at which to set the value.
 172:      * @param elem The element to store at that position.
 173:      * @return The value initially at that location.
 174:      * @throws IndexOutOfBoundsException If index is invalid.
 175:      */
 176:     @Override
 177:     public T set(int index, T elem) {
 178:         /* Find out where to look. */
 179:         final int offset   = computeOffset(index);
 180:         final int arrIndex = computeIndex(index);
 181:
 182:         /* Cache the value there and write the new one. */
 183:         T result = mArrays[offset][arrIndex];
 184:         mArrays[offset][arrIndex] = elem;
 185:
 186:         /* Hand back the old value. */
 187:         return result;
 188:     }
 189:
 190:     /**
 191:      * Returns the value of the element at the specified position.
 192:      *
 193:      * @param index The index at which to query.
 194:      * @return The value of the element at that position.
 195:      * @throws IndexOutOfBoundsException If the index is invalid.
 196:      */
 197:     @Override
 198:     public T get(int index) {
 199:         /* Check that this is a valid index. */
 200:         if (index < 0 || index >= size())
 201:             throw new IndexOutOfBoundsException("Index " + index + ", size " + size());
 202:
 203:         /* Look up the element. */
 204:         return mArrays[computeOffset(index)][computeIndex(index)];
 205:     }
 206:
 207:     /**
 208:      * Adds the specified element at the position just before the specified
 209:      * index.
 210:      *
 211:      * @param index The index just before which to insert.
 212:      * @param elem The value to insert
 213:      * @throws IndexOutOfBoundsException if the index is invalid.
 214:      */
 215:     @Override
 216:     public void add(int index, T elem) {
 217:         /* Confirm the validity of the index. */
 218:         if (index < 0 || index >= size())
 219:             throw new IndexOutOfBoundsException("Index " + index + ", size " + size());
 220:
 221:         /* Add a dummy element to ensure that everything resizes correctly.
 222:          * There's no reason to repeat the logic.
 223:          */
 224:         add(null);
 225:
 226:         /* Next, we need to shuffle down every element that appears after
 227:          * the inserted element.  We'll do this using our own public interface.
 228:          */
 229:         for (int i = size(); i > index; ++i)
 230:             set(i, get(i - 1));
 231:
 232:         /* Finally, write the element. */
 233:         set(index, elem);
 234:     }
 235:
 236:     /**
 237:      * Removes the element at the specified position from the HashedArrayTree.
 238:      *
 239:      * @param index The index of the element to remove.
 240:      * @return The value of the element at that position.
 241:      * @throws IndexOutOfBoundsException If the index is invalid.
 242:      */
 243:     @Override
 244:     public T remove(int index) {
 245:         /* Cache the value at the indicated position; this also does the bounds
 246:          * check.
 247:          */
 248:         T result = get(index);
 249:
 250:         /* Use a naive shuffle-down algorithm to reposition elements after
 251:          * the removed one.
 252:          */
 253:         for (int i = index + 1; i < size(); ++i)
 254:             set(i - 1, get(i));
 255:
 256:         /* Clobber the last element to play nicely with the garbage collector. */
 257:         set(size() - 1, null);
 258:
 259:         /* Decrement our size. */
 260:         --mSize;
 261:
 262:         /* If we are now at 1/8 total capacity, shrink the structure. */
 263:         if (size() * 8 <= mArrays.length * mArrays.length)
 264:             shrink();
 265:         /* Otherwise, if the size is now an even multiple of the array size,
 266:          * we can drop the very last array.  This is the array whose offset
 267:          * is one after the end of the elements.
 268:          */
 269:         else if (size() % mArrays.length == 0)
 270:             mArrays[computeOffset(size())] = null;
 271:
 272:         return result;
 273:     }
 274:
 275:     /**
 276:      * Given an index, returns the offset into the master array at which the
 277:      * element with that index can be found.
 278:      *
 279:      * @return The index into the topmost array where the given element can
 280:      *         be found.
 281:      */
 282:     private int computeOffset(int index) {
 283:         /* This can be computed by dividing the index by the index by the
 284:          * topmost array.  However, if we want to be very clever, we can do
 285:          * this efficiently by bit-shifting downard by the lg2 of the size
 286:          * of the topmost array.
 287:          */
 288:         return index >> mLgSize;
 289:     }
 290:
 291:     /**
 292:      * Given an index, returns the offset into the appropriate subarray in
 293:      * which the element with that index can be found.
 294:      *
 295:      * @return The index into the subarray array where the given element can
 296:      *         be found.
 297:      */
 298:     private int computeIndex(int index) {
 299:         /* This can be computed by modding the index by the index by the
 300:          * topmost array.  But we can do this more efficiently with a different
 301:          * tactic.  Since the array size is a perfect power of two, it must
 302:          * look like this:
 303:          *
 304:          * 00..010..0
 305:          *
 306:          * Subtracting one yields
 307:          *
 308:          * 00..001..1
 309:          *
 310:          * ANDing this with the index produces the value we're looking for.
 311:          */
 312:         return index & (mArrays.length - 1);
 313:     }
 314:
 315:     /**
 316:      * Grows the internal representation by doubling the size of the topmost
 317:      * array and copying the appropriate number of elements over.
 318:      */
 319:     private void grow() {
 320:         /* Double the size of the topmost array. */
 321:         T[][] newArrays = (T[][]) new Object[mArrays.length * 2][];
 322:
 323:         /* The new arrays each have size 2^(n + 1).  We need 2^(n - 1) of them
 324:          * to hold the old elements.  Allocate those here and copy everything
 325:          * over.
 326:          */
 327:         for (int i = 0; i < mArrays.length; i += 2) {
 328:             /* Allocate the array. */
 329:             newArrays[i / 2] = (T[]) new Object[newArrays.length];
 330:
 331:             /* Use System.arraycopy to move everything over. */
 332:             System.arraycopy(mArrays[i], 0, newArrays[i / 2], 0, mArrays.length);
 333:             System.arraycopy(mArrays[i + 1], 0, newArrays[i / 2], mArrays.length, mArrays.length);
 334:
 335:             /* Null out the old arrays to be nice to the GC during this
 336:              * potentially stressful time.
 337:              */
 338:             mArrays[i] = mArrays[i + 1] = null;
 339:         }
 340:
 341:         /* Switch out this new array for the old. */
 342:         mArrays = newArrays;
 343:
 344:         /* Bump up lg2 of the size. */
 345:         ++mLgSize;
 346:     }
 347:
 348:     /**
 349:      * Decreases the size of the HAT by shrinking into a better fit.
 350:      */
 351:     private void shrink() {
 352:         /* If the size of the topmost array is at its minimum, don't do
 353:          * anything.  This doesn't change the asymptotic memory usage because
 354:          * we only do this for small arrays.
 355:          */
 356:         if (mArrays.length == kMinArraySize) return;
 357:
 358:         /* Otherwise, we currently have 2^(2n) / 8 = 2^(2n - 3) elements.
 359:          * We're about to shrink into a grid of 2^(2n - 2) elements, and so
 360:          * we'll fill in half of the elements.
 361:          */
 362:         T[][] newArrays = (T[][]) new Object[mArrays.length / 2][];
 363:
 364:         /* Copy everything over.  We'll need half as many arrays as before. */
 365:         for (int i = 0; i < newArrays.length / 2; ++i) {
 366:             /* Create the arrays. */
 367:             newArrays[i] = (T[]) new Object[newArrays.length];
 368:
 369:             /* Move everything into it.  If this is an odd array, it comes
 370:              * from the upper half of the old array; otherwise it comes from
 371:              * the lower half.
 372:              */
 373:             System.arraycopy(mArrays[i / 2], (i % 2 == 0)? 0 : newArrays.length,
 374:                              newArrays[i], 0, newArrays.length);
 375:
 376:             /* Play nice with the GC.  If this is an odd-numbered array, we
 377:              * just copied over everything we needed and can clear out the
 378:              * old array.
 379:              */
 380:             if (i % 2 == 1)
 381:                 mArrays[i / 2] = null;
 382:         }
 383:
 384:         /* Copy the arrays over. */
 385:         mArrays = newArrays;
 386:
 387:         /* Drop the lg2 of the size. */
 388:         --mLgSize;
 389:     }
 390: }

http://www.keithschwarz.com/interesting/code/?dir=hashed-array-tree

http://en.wikipedia.org/wiki/Hashed_array_tree

posted on 2012-06-08 14:40 changedi 阅读(1578) 评论(0)  编辑  收藏 所属分类: 好代码分享 只有注册用户登录后才能发表评论。 网站导航: 相关文章: