This repository has been archived by the owner on Sep 1, 2023. It is now read-only.

Capacity Test Results

Subutai Ahmad edited this page Jan 13, 2015 · 51 revisions

January 5, 2015

Motivation and Goals

Why are we doing this? Why is it important? What would we like to see? How does this impact future work?

Description

This is an experiment that tests the capacity of the sensorimotor inference and temporal pooling algorithms.

Why do we use random SDRs for each element? Is this a problem?

The setup for the experiment is as follows:

  1. Create N worlds, each made up of M elements. Each element is a random (and unique) SDR. For example, if N=3 and M=4, the worlds would look like: ABCD, EFGH, IJKL. This is a generalized sensorimotor learning problem.
  2. An agent can move around a world and perceive elements.
  3. Set up two HTM layers, one sensorimotor layer 4, and one temporal pooling layer 3 (which pools over layer 4). When the agent perceives an element, its SDR activates columns in layer 4.
  4. As the agent moves around the different worlds, record the activations of layer 4 and layer 3.
  5. Train the layers by having the agent do an exhaustive sweep for each world, moving from every element to every other element in the world, with resets in between worlds.
  6. Test by having the agent move for some number of iterations randomly within each world, with resets in between worlds.
  7. Characterize the representations formed in the temporal pooling layer of each world as the agent moves around. We are interested in two metrics: stability and distinctness (see section "Metrics" below).
  8. Fix M, and increase N. Record the effect on stability and distinctness.
  9. Fix N, and increase M. Record the effect on stability and distinctness.
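The experimental setup above can be sketched as follows. The `make_worlds`, `exhaustive_sweeps`, and `random_walk` functions are hypothetical stand-ins for the actual experiment code, shown only to make steps 1, 5, and 6 concrete:

```python
import random

def make_worlds(num_worlds, num_elements, n=1024, w=20):
    """Step 1: build num_worlds worlds of num_elements random, unique SDRs.
    Each SDR is a sorted tuple of w active column indices out of n
    (n and w follow the experiment parameters below)."""
    seen = set()
    worlds = []
    for _ in range(num_worlds):
        world = []
        for _ in range(num_elements):
            while True:  # resample on the (unlikely) collision
                sdr = tuple(sorted(random.sample(range(n), w)))
                if sdr not in seen:
                    seen.add(sdr)
                    break
            world.append(sdr)
        worlds.append(world)
    return worlds

def exhaustive_sweeps(worlds):
    """Step 5: training order -- move from every element to every other
    element within each world, yielding (world_index, from_sdr, to_sdr),
    with None as a reset marker between worlds."""
    for i, world in enumerate(worlds):
        for a in range(len(world)):
            for b in range(len(world)):
                if a != b:
                    yield (i, world[a], world[b])
        yield None  # reset between worlds

def random_walk(worlds, iterations):
    """Step 6: testing order -- random moves within each world."""
    for i, world in enumerate(worlds):
        for _ in range(iterations):
            yield (i, random.choice(world))
        yield None  # reset between worlds
```

For the N=3, M=4 example in step 1, the exhaustive sweep visits 4×3 = 12 transitions per world, 36 in total.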

Metrics

Why choose these two metrics?

Stability

Stability is measured by taking the temporal pooler representation at each iteration as the agent moves around a particular world, and comparing it against the representation at every other iteration in that same world. Stability confusion is defined as the number of active cells minus the number of cells shared between the two representations. Perfect stability is a confusion value of 0.
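This metric reduces to set operations on active-cell indices. A minimal sketch (the function name and set-based representation are illustrative, not the experiment code; "cells active" is interpreted as the count in the first representation):

```python
def stability_confusion(rep_a, rep_b):
    """Stability confusion between two temporal pooler representations
    from the SAME world: number of active cells minus the number of
    cells the two representations share. 0 = perfectly stable."""
    a, b = set(rep_a), set(rep_b)
    return len(a) - len(a & b)
```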

Distinctness

Distinctness is measured by taking the temporal pooler representation at each iteration as the agent moves around a particular world, and comparing it against the representation at every iteration in every other world. Distinctness confusion is defined as the number of cells shared between the two representations. Perfect distinctness is a confusion value of 0.
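The companion metric is even simpler; again a set-based sketch rather than the actual experiment code:

```python
def distinctness_confusion(rep_a, rep_b):
    """Distinctness confusion between temporal pooler representations
    from two DIFFERENT worlds: the number of active cells the two
    representations share. 0 = perfectly distinct."""
    return len(set(rep_a) & set(rep_b))
```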

Experiment

Global parameters

tmParams:
    cellsPerColumn: 8
    initialPermanence: 0.5
    connectedPermanence: 0.6
    permanenceIncrement: 0.1
    permanenceDecrement: 0.02
    maxSegmentsPerCell: 255
    maxSynapsesPerSegment: 255
tpParams:
    synPermInactiveDec: 0
    synPermActiveInc: 0.001
    synPredictedInc: 0.5
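Each experiment section below overrides or extends these global parameters. One way to express that is a plain recursive dict merge (a sketch under the assumption that the parameters are held as nested dicts; the helper name is illustrative):

```python
def merge_params(base, overrides):
    """Recursively merge experiment-specific overrides into the global
    parameter dict, returning a new dict without mutating either input."""
    merged = dict(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_params(merged[key], value)
        else:
            merged[key] = value
    return merged

GLOBAL_PARAMS = {
    "tmParams": {
        "cellsPerColumn": 8,
        "initialPermanence": 0.5,
        "connectedPermanence": 0.6,
        "permanenceIncrement": 0.1,
        "permanenceDecrement": 0.02,
        "maxSegmentsPerCell": 255,
        "maxSynapsesPerSegment": 255,
    },
    "tpParams": {
        "synPermInactiveDec": 0,
        "synPermActiveInc": 0.001,
        "synPredictedInc": 0.5,
    },
}

# Example: the "1024 columns, strict parameters" experiment below.
strict_1024 = merge_params(GLOBAL_PARAMS, {
    "tmParams": {
        "columnDimensions": [1024],
        "minThreshold": 40,
        "activationThreshold": 40,
        "maxNewSynapseCount": 40,
    },
    "tpParams": {
        "columnDimensions": [1024],
        "numActiveColumnsPerInhArea": 20,
        "potentialPct": 0.9,
        "initConnectedPct": 0.5,
    },
})
```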

1024 columns, strict parameters

Parameters

n: 1024
w: 20
tmParams:
  columnDimensions: [1024]
  minThreshold: 40
  activationThreshold: 40
  maxNewSynapseCount: 40
tpParams:
  columnDimensions: [1024]
  numActiveColumnsPerInhArea: 20
  potentialPct: 0.9
  initConnectedPct: 0.5

Results

1024_strict.xlsx

2 worlds, increasing # elements

10 elements, increasing # worlds

1024 columns, reasonable parameters (v1)

Parameters

n: 1024
w: 20
tmParams:
  columnDimensions: [1024]
  minThreshold: 9
  activationThreshold: 12
  maxNewSynapseCount: 20
tpParams:
  columnDimensions: [1024]
  numActiveColumnsPerInhArea: 20
  potentialPct: 0.9
  initConnectedPct: 0.5
  poolingThreshUnpredicted: 0.4

Results

1024_reasonable1.xlsx

2 worlds, increasing # elements

10 elements, increasing # worlds

1024 columns, reasonable parameters (v2)

Parameters

n: 1024
w: 20
tmParams:
  columnDimensions: [1024]
  minThreshold: 15
  activationThreshold: 20
  maxNewSynapseCount: 30
tpParams:
  columnDimensions: [1024]
  numActiveColumnsPerInhArea: 20
  potentialPct: 0.9
  initConnectedPct: 0.5
  poolingThreshUnpredicted: 0.4

Results

1024_reasonable2.xlsx

2 worlds, increasing # elements

Note: This experiment only ran up to 80 elements; beyond that, layer 4 hit the synapse limit. Be aware that the scale of this chart therefore differs from the others.

10 elements, increasing # worlds

1024 columns, reasonable parameters (v3)

Parameters

n: 1024
w: 20
tmParams:
  columnDimensions: [1024]
  minThreshold: 20
  activationThreshold: 20
  maxNewSynapseCount: 30
tpParams:
  columnDimensions: [1024]
  numActiveColumnsPerInhArea: 20
  potentialPct: 0.9
  initConnectedPct: 0.5
  poolingThreshUnpredicted: 0.4

Results

1024_reasonable3.xlsx

2 worlds, increasing # elements

10 elements, increasing # worlds

2048 columns, reasonable parameters (v1)

Parameters

n: 2048
w: 40
tmParams:
  columnDimensions: [2048]
  minThreshold: 40
  activationThreshold: 40
  maxNewSynapseCount: 60
tpParams:
  columnDimensions: [2048]
  numActiveColumnsPerInhArea: 40
  potentialPct: 0.9
  initConnectedPct: 0.5
  poolingThreshUnpredicted: 0.4

Results

Note: The scale of these charts differs from the others: the counts step by 20 instead of by 10, to speed up the tests.

2048_reasonable1.xlsx

2 worlds, increasing # elements

10 elements, increasing # worlds

2048 columns, reasonable parameters (v2)

Parameters

n: 2048
w: 40
tmParams:
  columnDimensions: [2048]
  minThreshold: 20
  activationThreshold: 20
  maxNewSynapseCount: 30
tpParams:
  columnDimensions: [2048]
  numActiveColumnsPerInhArea: 40
  potentialPct: 0.9
  initConnectedPct: 0.5
  poolingThreshUnpredicted: 0.4

Results

Note: The scale of these charts differs from the others: the counts step by 20 instead of by 10, to speed up the tests.

2048_reasonable2.xlsx

2 worlds, increasing # elements

10 elements, increasing # worlds

Discussion

Are these good results? Are they bad?

It took several iterations of parameter tuning to find a good set of parameters, and a better set may yet exist. The best results so far came from "1024 columns, reasonable parameters (v3)". With this parameter set, we can handle up to 2 worlds of 90 elements each, and up to 70 worlds of 10 elements each, with both max stability confusion and max distinctness confusion at or below 6 (out of 20 active cells).

With the strict parameter set, max stability confusion would spike to 20 at a discrete point. In the experiment increasing the number of elements per world, this was exactly the point at which any columns in layer 4 first burst. This makes sense: the strict parameter set causes the temporal pooler to stop pooling the moment there is any bursting at all, and bursting happens when layer 4 runs out of segments on its cells and has to start deleting old segments to make room for new ones. In the experiment increasing the number of worlds, there was no bursting: with fewer transitions per world for layer 4 to represent, it never runs out of segments, and in fact layer 4 was predicting perfectly the whole time. Stability confusion therefore spiked purely because layer 3 ran out of capacity, though it is not yet clear exactly why.

Going to 2048 columns (and scaling all the parameters up appropriately) did not seem to make a large difference in capacity. It still broke around the same places, whether using the same activation thresholds as in the 1024-column experiments or double those thresholds. This is somewhat surprising, especially for the experiment increasing the number of worlds, and it raises the question: what exactly is happening in layer 3 when capacity runs out as the number of worlds increases?
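For reference, "scaling the parameters up appropriately" keeps the column-level sparsity of the input code constant, so doubling n alone does not change how sparse the representations are, only how many columns are available:

```python
# Sparsity of the input SDRs under the two configurations tested above.
configs = [(1024, 20), (2048, 40)]
sparsities = {n: w / n for n, w in configs}  # both ~2% (0.01953125)
```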

Add some notes about speed optimization. Where are the bottlenecks? What did you do and what is left to do?

What do you think is left to do? e.g. discuss the online learning issue.