CrX Architecture Part-3
[CrX Architecture Cont..]
The return
The second network's output is returned as input to the first network in order to update the clusters in the first network. The next time the input enters the model, the updated clusters will correctly understand it in a very short time.
Math behind the return
The model outputs pixel values. The output is returned as input to the first network in order to store closely related clusters in the same layer of the first network, which would not be possible without the return. The return stage contains a finalizer function that takes each output and measures its similarity to the previously accumulated outputs. If at least 3 outputs from the second network are similar, that output is returned to the first network as input.
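The finalizer is not given in code in this post, but a minimal sketch of its matching logic, under stated assumptions, might look like the following. All names (`finalize`, `min_matches`, `tolerance`) are hypothetical, and similarity is assumed here to be mean absolute difference between pixel-value vectors, which the post does not specify:

```python
import numpy as np

def finalize(new_output, accumulated, min_matches=3, tolerance=0.05):
    """Hypothetical finalizer sketch: return new_output for feedback into
    the first network only if it closely matches at least `min_matches`
    outputs (counting itself) from the second network."""
    matches = sum(
        1 for prev in accumulated
        if np.mean(np.abs(new_output - prev)) < tolerance
    )
    accumulated.append(new_output)
    if matches + 1 >= min_matches:  # count the new output itself
        return new_output           # stable: feed back into first network
    return None                     # not yet stable: keep looping

# Usage: collect second-network outputs until one is stable enough.
history = []
for out in [np.ones(4) * 0.5, np.ones(4) * 0.51, np.ones(4) * 0.5]:
    result = finalize(out, history)
print(result)  # the third output matches the first two, so it is returned
```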
Ø Purpose of returning the output as input -
This concept was created to combine two related clusters that sit in two different layers. The destabilizing mechanism acts as an algorithm that grows the smallest-truth cluster in each layer. This makes each output from the model genuinely reflect reality: even if an output is totally irrelevant, it will still reflect reality (good, but possibly not correct). Additionally, the destabilizing mechanism mixes the smallest truths, enhancing the problem-solving abilities of this model.
The reason for returning the output as input to the first network is that the raw pixel input may or may not activate all the required clusters, since the required clusters may live in different layers that cannot be connected simultaneously (cluster connection is possible at only one layer at a time). Therefore, there must be a mechanism that takes the outputs from two layers and combines them into one (for example, with 60% of the required cluster in one layer and 40% in another). A "required cluster" is a group of clusters that provides output without any error. The "Wiggle Connection" is the mechanism that finds the required clusters in different layers and combines them into one, as in the sketch below.
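The post does not give the Wiggle Connection in code. A minimal sketch of just the combination step, assuming clusters are represented as vectors and using the 60%/40% split from the example above, could be:

```python
import numpy as np

def wiggle_combine(cluster_a, cluster_b, weight_a=0.6, weight_b=0.4):
    """Hypothetical sketch of the Wiggle Connection's combination step:
    merge the partial required clusters found in two different layers
    into one cluster, using the 60/40 split mentioned in the text."""
    return weight_a * np.asarray(cluster_a) + weight_b * np.asarray(cluster_b)

# Usage: 60% of the required cluster lives in layer 1, 40% in layer 2.
layer1_part = np.array([1.0, 0.0, 0.5])
layer2_part = np.array([0.0, 1.0, 0.5])
print(wiggle_combine(layer1_part, layer2_part))  # [0.6, 0.4, 0.5]
```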
Another reason for returning the output is to reduce noise and increase signal: the output is the result of reduced, correct, important information (information specific to the model). If we give this reduced information to the model again, it specifically activates only the other correct clusters that are necessary to complete the full output. For example, if I give an input like D, A, E, R, B, then a model with alphabet knowledge outputs A. If this A is sent back as input, the model then gives B. The outputs are, in effect, the only correctly phrased questions that the model can understand well enough to give a clearer answer. In this way, over many iterations, the complete output (at least, what is known to the model) can be achieved. Once again, consider how the output is handled as input in the network and its influence on the network.
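A minimal sketch of this iterative feedback, built around the alphabet example above; the `model` function and the fixed iteration count are hypothetical stand-ins, not the architecture's actual networks:

```python
def model(context):
    """Hypothetical stand-in for the two-network model: given what has
    been produced so far, emit the next letter it 'knows' (here, simply
    the letter after the last one produced)."""
    if not context:
        return "A"
    return chr(ord(context[-1]) + 1)

# Feed each output back as input until the known sequence is complete.
produced = []
for _ in range(5):  # many iterations in the real model; 5 for the demo
    produced.append(model(produced))
print("".join(produced))  # ABCDE: each returned output unlocked the next
```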
When the finalized output enters the first network, it is treated as input again, and all processes (partitioning, clustering, and looping) occur in both the first and second networks. It is assumed that the raw input will not activate a certain set of clusters spread across different layers. Therefore, this output, treated as input, collects all the necessary inputs that the model can understand in its own language, activating all the right clusters faster. Thus, this process serves as a way to update the clusters in the first network's layers.
Ø How will the return help us reduce the wrong-output rate and increase the right-output rate, and how will it differentiate right from wrong output?
The return saves the combined right cluster from the wiggle connection in the second network's layers. The second network gives a finalized output that has the right set of combined clusters, collected from all the layers of the first network through the wiggle connection. This perfectly altered clustered randomness, as the final output, is returned as input to the first network. This stores the right set of clustered vectors, combined through many loops, in the first network's layers. The next time the same input is given, the first network produces output without many loop iterations, which reduces time and cost. The probability of a right output depends on how well the model stores the smallest truth.
There is also a smaller contribution from this return function: it eliminates much of the noise in the model by emitting only outputs that do not vary much. That is, the outputs from the second network's layers should not differ much; they should be the same for at least two outputs. This matching of outputs eliminates much of the noise that would otherwise be stored in the first network's layers, so errors are permanently removed. That is how this function helps in generating a good output.
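The claim that a repeated input skips most loop iterations can be pictured as a cache of stored cluster combinations. This sketch is purely illustrative; the `stored_clusters` dictionary, the `slow_solver` stand-in, and the iteration counts are all hypothetical:

```python
stored_clusters = {}  # hypothetical store: input key -> finalized output

def run_model(input_key, slow_solver):
    """If the return has already stored the right combined clusters for
    this input, answer immediately; otherwise run the full loop and
    store the result for next time."""
    if input_key in stored_clusters:
        return stored_clusters[input_key], 1      # one pass, no looping
    output, iterations = slow_solver(input_key)   # many loop iterations
    stored_clusters[input_key] = output           # the return stores it
    return output, iterations

solver = lambda key: (key.upper(), 50)  # stand-in for the looped networks
print(run_model("bread", solver))  # ('BREAD', 50): first time, many loops
print(run_model("bread", solver))  # ('BREAD', 1): stored clusters reused
```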
Ø Biologically-inspired explanation
The returning impulse is only the highest-intensity impulse that entered the brain. Therefore, only sustained impulses become returning impulses, which is equivalent to the model producing the same output two or more times. This repetition of the same output two or more times is equivalent to a high-intensity impulse entering the brain, and it is important for the brain to solve the problem.
The encoder and decoder
This model needs an encoder and decoder because the model processes input impulses simply by changing their direction. The model does not know what those impulses represent. It just receives and delivers what those impulses demand, such as clustering and changing direction. The encoder and decoder are what understand what each impulse represents, and that is why this model needs them.
Ø Encoder
For the prototype, simple images like MNIST will be used. The encoder will consist of a function that extracts pixel values from the raw image. These pixel values will then be fed into a trainable distribution density function, which scales them so that importance is given only to the important locations of the input. After training, the distribution takes on a multi-modal shape, where peaks (high-density regions) correspond to the important regions of the input pixel values. These high-density regions scale the feature maps to give them more priority when processed inside the network, while low-density regions scale the features to give them lower priority. Since the distribution is trainable, it adapts to each kind of input category.
Math behind the encoder
The encoder converts the image into pixel values. A trainable distribution density function then scales those pixel values.
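The post does not specify the density function's form. A minimal sketch, assuming a mixture of 2-D Gaussians over pixel coordinates as the trainable density (an assumption for illustration, not the stated design):

```python
import numpy as np

def density_map(h, w, means, sigmas, weights):
    """Evaluate a mixture of 2-D Gaussians over the pixel grid; means,
    sigmas, and weights are the trainable parameters (assumed form)."""
    ys, xs = np.mgrid[0:h, 0:w]
    d = np.zeros((h, w))
    for (my, mx), s, wgt in zip(means, sigmas, weights):
        d += wgt * np.exp(-((ys - my) ** 2 + (xs - mx) ** 2) / (2 * s ** 2))
    return d / d.max()  # normalize so the strongest peak scales by 1.0

def encode(image, means, sigmas, weights):
    """Scale pixel values by the density: high-density (important)
    regions keep their magnitude, low-density regions are suppressed."""
    return image * density_map(*image.shape, means, sigmas, weights)

# Usage on a fake 28x28 'MNIST' image with two learned density peaks.
img = np.random.rand(28, 28)
scaled = encode(img, means=[(7, 7), (20, 20)], sigmas=[3.0, 3.0],
                weights=[1.0, 0.8])
```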
Ø Decoder
The decoder will contain the same trained distribution density function (taken from the encoder), and this function will de-scale the scaled output pixel values. The de-scaled values are then passed to a decoder function to reconstruct the output for visualizing the input image.
Math behind the decoder
The output consists of the values of the scaled feature map, which are de-scaled by the density distribution function. A decoder function then converts the pixel values back into an image.
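Continuing the Gaussian-mixture assumption from the encoder sketch above, de-scaling can be the element-wise inverse of the encoder's scaling (again an illustrative choice, not a formula the post states):

```python
import numpy as np

def decode(scaled_pixels, density, eps=1e-6):
    """De-scale the output using the same trained density the encoder
    used; eps guards against division by near-zero density values."""
    pixels = scaled_pixels / (density + eps)
    return np.clip(pixels, 0.0, 1.0)  # decoder step: back to image range

# Usage: `density` must be the exact map produced by the encoder (sync).
density = np.full((28, 28), 0.5)           # stand-in for the trained map
scaled = np.random.rand(28, 28) * density  # pretend encoder output
image = decode(scaled, density)            # recovers the original pixels
```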
Ø The sync
The encoder and decoder share the same continuously trainable distribution density function, which correctly scales and de-scales the pixel values. Their synchronization is very important so that the destabilizing mechanism works correctly and the stored clusters represent the smallest truth of reality.
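One simple way to guarantee this synchronization (an implementation assumption, not something the post prescribes) is to give the encoder and decoder a reference to one shared parameter object, so any training update is seen by both sides at once:

```python
import numpy as np

class SharedDensity:
    """Single trainable density owned by both encoder and decoder, so
    scaling and de-scaling can never drift out of sync."""
    def __init__(self, shape):
        self.values = np.full(shape, 0.5)  # trainable parameters

    def scale(self, pixels):
        return pixels * self.values          # encoder side

    def descale(self, scaled, eps=1e-6):
        return scaled / (self.values + eps)  # decoder side

shared = SharedDensity((28, 28))
img = np.random.rand(28, 28)
roundtrip = shared.descale(shared.scale(img))  # ~img, up to eps
shared.values *= 1.1  # a training update is instantly seen by both sides
```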
Ø Biologically-inspired explanation
The brain receives impulses from the external environment by selectively prioritizing some pieces of input from the overall input. Over time, every piece becomes prioritized if we continuously receive that particular type of input. However, occasionally viewed types of inputs are selectively processed. The input regions, like the rods and cones in the eyes, and other receptors that receive input, contain dense neurons that can be attracted to the incoming impulses from the external environment. Naturally, impulses initiate the synthesis of neurochemicals that assist in making synapses or connections. This process leads to neurons becoming denser in specific locations on the input region. This denser region collects more impulses from the same external environment (any particular spot on the overall image), which is useful for propelling the impulse a long way into the brain for computation. Additionally, the denser neurons come closer together so that they can connect with each other, allowing the receptor to activate itself after being activated by the external environment.
This setup exists to fully propagate the impulse and assist it on the long journey it must travel through the brain. The encoder and decoder are set up to form these denser neurons at both the encoder and decoder regions, in the form of the distribution density function. Reactivation is not necessary for artificial models, but selectivity over specific pieces of input is important; therefore, only the concept of forming denser neurons was applied in this model. Selectivity is needed to form clusters in the network. (Refer to BrainX theory for further details.)
We will include some kind of distribution to indicate the strength of the impulse. We will store the distribution over each pixel vector to assign a specific priority to each chained neuron: if one input neuron holds the highest distribution parameter, then its value will be higher. This is analogous to what happens in the brain, where more connections at an input neuron mean higher intensity, so that neuron gets activated to give a specific output. For each colour, there is a predefined intensity (predefined neurons accumulated because of that limited intensity) determined by our eye cells; intensities above and below that range are not suitable for detecting the colour. A trainable distribution will act in the same way: more distribution mass means more neurons (since neurons are attracted to impulses, the amount of impulse determines the attraction of neurons) at the place that detects a particular colour.
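The "predefined intensity range per colour" idea can be pictured as a Gaussian tuning curve: a detector responds strongly near its preferred intensity and weakly outside that band. The curve below is purely illustrative; the preferred intensity and width are made-up numbers:

```python
import numpy as np

def colour_response(intensity, preferred=0.6, width=0.1):
    """Illustrative Gaussian tuning curve: response peaks at the
    detector's preferred intensity and falls off above and below it,
    mirroring the limited intensity range described in the text."""
    return np.exp(-((intensity - preferred) ** 2) / (2 * width ** 2))

for i in [0.2, 0.6, 0.9]:
    print(f"intensity {i:.1f} -> response {colour_response(i):.3f}")
# 0.6 (inside the range) responds strongly; 0.2 and 0.9 barely respond
```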