4

We are taught that the abstraction of RAM is a long array of bytes, and that it takes the CPU the same amount of time to access any part of it. What is the device that can access any byte out of the 4 gigabytes (on my computer) in the same amount of time? This does not seem like a trivial task to me.

I have asked colleagues and my professors, but nobody can pinpoint how this task can be achieved with simple logic gates, and if it isn't just a tricky combination of logic gates, then what is it?

My personal guess is that you could achieve access to any memory location in O(log(n)) time, where n is the size of memory, because each gate would split the memory in two and forward the memory access instruction to the next split-the-memory-in-two gate. But that requires a LOT of gates. I can't come up with any other educated guess, and I don't even know the name of the device that I should look up on Google.

Please help my anguished curiosity, and thanks in advance.

edit: This is what I learned!

Quote from your answer: "the RAM can send the value from cell addressed X to some output pins". Here is where everyone skips (again) the thing that is not trivial for me. The way I see it, in order to build a gate that decides, from 64 pins, which byte out of 2^64 to get, each pin needs to split the overall possible range of memory into two. If the bit at index 0 is 0, then the address is in the range 0 to 2^64/2; else the address is in the range 2^64/2 to 2^64. And so on. The number of gates (let's call them that) the memory fetch will go through is 64, a constant. However, the number of gates needed is N, where N is the number of memory bytes.

Just because there are 64 pins, it doesn't mean you can decode them into a single fetch from a range of 2^64 for free. Does 4 gigabytes of memory come with 4 gigabytes of gates in the memory controller???

Now this can be improved. As I furiously read more and more about how this memory is architected: if you place the memory into a matrix with sqrt(N) rows and sqrt(N) columns, the number of gates a memory fetch needs to go through is O(2*log(sqrt(N))) and the number of gates required is 2*sqrt(N), which is much better, and I think it's probably a trade secret.
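The row/column trade-off described above can be sketched in code (a hypothetical simulation, not actual hardware): two small one-hot decoders, one driven by the row half of the address and one by the column half, together select a single cell of the matrix.

```python
# Sketch (illustrative, not real hardware): N cells arranged in a
# sqrt(N) x sqrt(N) matrix. The address is cut into a row half and a
# column half, each driving its own one-hot decoder.

def one_hot_decode(bits):
    """Turn a list of address bits (MSB first) into a one-hot list
    of 2**len(bits) select lines."""
    index = 0
    for b in bits:
        index = (index << 1) | b
    lines = [0] * (2 ** len(bits))
    lines[index] = 1
    return lines

def fetch(memory_matrix, address, address_bits):
    half = address_bits // 2
    row_bits = [(address >> (address_bits - 1 - i)) & 1 for i in range(half)]
    col_bits = [(address >> (half - 1 - i)) & 1 for i in range(half)]
    row_select = one_hot_decode(row_bits)   # sqrt(N) wordlines
    col_select = one_hot_decode(col_bits)   # sqrt(N) bitlines
    row = row_select.index(1)
    col = col_select.index(1)
    return memory_matrix[row][col]

# 16 cells -> 4x4 matrix, 4 address bits, two 2-to-4 decoders
# (8 decoder outputs total instead of 16)
mem = [[r * 4 + c for c in range(4)] for r in range(4)]
print(fetch(mem, 11, 4))  # address 0b1011 -> row 0b10=2, col 0b11=3 -> 11
```

For 16 cells a flat decoder needs 16 outputs, but the two half-decoders need only 4 + 4, which is the 2*sqrt(N) savings mentioned above.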

/edit

Helical
  • 125
  • 9
  • 3
    This question appears to be off-topic because it is about hardware architecture –  Jan 06 '14 at 22:22
  • What is n? The size of memory? – Attila Jan 06 '14 at 22:25
  • 1
If it's off-topic, then which topic is it out of? I wasn't aware of a supposed topic; first question ever in here, btw. – Helical Jan 06 '14 at 22:25
  • @user3167049 See [help/on-topic] for what is on-topic here. –  Jan 06 '14 at 22:32
  • 3
    Actually the best asymptotic access speed is something like `O(sqrt(n))`. Our universe is 3D, but the maximum information density is proportional to the surface area (not volume) of the space (see http://physics.stackexchange.com/questions/2281/). And information cannot travel faster than light. But in practice, we sweep all of these concerns under the rug and assume constant-access RAM because (a) it's close enough in practice and (b) it still allows us to compare the relative merits of competing algorithms. – Nemo Jan 06 '14 at 22:35
  • hmmm, then I do fall into the category of off-topic. So what. – Helical Jan 06 '14 at 22:36
  • Actually I think today DRAM is NOT random access; data is read in bursts, so not random. The name just stuck... – Brant Unger Jan 06 '14 at 22:38
  • sqrt(n)? Isn't log(n) faster than sqrt(n)? Also, I don't care how dense information is in my question, even though it's very important; for all I care 4 gigabytes can fit in my fridge. All I want to know is what gets the instruction to read from 0x7a883bbc362f6d and does it, how long it takes, and whether it's truly random access? – Helical Jan 06 '14 at 22:42
  • It takes a basic circuit in computer design, a [demultiplexer](http://en.wikipedia.org/wiki/Demultiplexer). Ask more questions about it at electronics.stackexchange.com – Hans Passant Jan 06 '14 at 23:18
  • RAM *is* memory. So it's as fast as memory is. Memory is, in fact, limited by logN like pretty much everything else, it's just that the N is a constant that assumes the largest installable memory. – Hot Licks Jan 07 '14 at 00:04
  • @Nemo that is deep... – OutFall Jan 07 '14 at 00:14
  • @user3167049: Yes, log(n) is faster than sqrt(n), just like "constant" is faster than log(n). That's the whole point. If you are going to think about physical circuits, then you might as well think about physics... And physics says the access time can be no better than O(sqrt(n)) as n gets large. (And yes, you absolutely do care about information density, because the farther away a bit is, the longer it takes to access.) – Nemo Jan 07 '14 at 00:22
  • @HotLicks RAM is random access memory and it has a constant time; the only delay is related to propagation time, that is, the time it takes the signal to get through the circuit. – 3yakuya Jan 07 '14 at 00:28
  • So, "constant" may sound wrong, as in practice the propagation time depends on the number of pins (longer connections and more gates...), so it does somehow depend on the size, but this is not like the time complexity of some algorithm. This is plain electronics (and so, physics). – 3yakuya Jan 07 '14 at 00:48
  • @Byakuya - *AS INSTALLED*, RAM has a constant cycle time. But this is only because once the RAM has been configured into a system the "N" in the log(N) has been fixed. Rest assured that one can design a, say, 1 meg RAM system to be significantly faster than a 1 gig RAM system, even if the same actual RAM chips are used in both. The "complexity" is there, it's just (somewhat artificially) buried in constants. – Hot Licks Jan 07 '14 at 01:28
  • I get your idea now. I believe I expressed the same idea in my previous comment here, as well as in my answer. – 3yakuya Jan 07 '14 at 01:32
  • This is a very good question. However, accessing memory is not what gives O(1) access; it's accessing the cache which gives O(1) access, since a cache is like a hashtable. If it's not in the cache then it takes longer, but if it is in the cache then it's instant. Cache-oblivious algorithms take advantage of this. – Har Mar 28 '17 at 18:00

4 Answers

8

What the heck, I might as well make this an answer.

Yes, in the physical world, memory access cannot be constant time.

But it cannot even be logarithmic time. The O(log n) circuit you have in mind ultimately involves some sort of binary (or whatever) tree, and you cannot make a binary tree with constant-length wires in a 3D universe.

Whatever the "bits per unit volume" capacity of your technology is, storing n bits requires a sphere with radius O(n^(1/3)). Since information can only travel at the speed of light, accessing a bit at the other end of the sphere requires time O(n^(1/3)).

But even this is wrong. If you want to talk about actual limitations of our universe, our physics friends say the absolute maximum number of bits you can store in any sphere is proportional to the sphere's surface area, not its volume. So the actual radius of a minimal sphere containing n bits of information is O(sqrt(n)).

As I mentioned in my comment, all of this is pretty much moot. The models of computation we generally use to analyze algorithms assume constant-access-time RAM, which is close enough to the truth in practice and allows a fair comparison of competing algorithms. (Although good engineers working on high-performance code are very concerned about locality and the memory hierarchy...)

Nemo
  • 70,042
  • 10
  • 116
  • 153
  • Does it even make sense for you guys to be using big-O? What does n represent in these situations? – Celeritas Dec 16 '15 at 04:05
  • @Celeritas: Yes it makes sense. In this context, n is the total size of the memory in bits. (That is, the total number of possible addresses.) – Nemo Dec 16 '15 at 13:15
  • @Celeritas: This is not totally academic, by the way. Accessing a 32k memory is faster than a 4G memory, always. This is why, even with an infinite budget, you would not want a huge L1 cache (for example). It is also why most of these other answers are wrong. This is what I was getting at with my "memory hierarchy" comment. Computers implementing the RAM model will always have a memory hierarchy, from now until the end of time. So it is wise to write code that caters to them if you care about speed. – Nemo Dec 16 '15 at 20:57
2

Let's say your RAM has 2^64 cells (places where it is possible to store a single value, say 8 bits each). Then it needs 64 pins to address every cell with a different number. When a binary number X 'appears' at the input pins of your RAM, the RAM can send the value from the cell addressed X to some output pins, and your CPU can get the value from there. In hardware the addressing can be done quite easily, for example by using multiple NAND gates (such an 'addressing device' built from logic gates is called a decoder).

So it is all happening at the hardware level; this is just direct addressing. If the CPU is able to provide 64 bits to the 64 pins of your RAM, it can address every single memory cell (as 64 bits are enough to represent any number up to 2^64 - 1). The only reason why you do not get the value immediately is a kind of 'propagation time', i.e. the time it takes for the signal to go through all the logic gates in the circuit.
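A minimal sketch of such a decoder, using a toy 2-bit address (the gate functions here are illustrative primitives, not a real chip design):

```python
# Sketch (illustrative): a 2-to-4 decoder built from NOT and AND
# primitives. Output line i goes high exactly when the address bits
# encode i -- this is the 'addressing device' described above.

def NOT(a): return 1 - a
def AND(a, b): return a & b

def decoder_2to4(a1, a0):
    # Each output ANDs the address bits, inverted where needed,
    # so exactly one select line is active for any address.
    return [
        AND(NOT(a1), NOT(a0)),  # selects cell 0 (address 00)
        AND(NOT(a1), a0),       # selects cell 1 (address 01)
        AND(a1, NOT(a0)),       # selects cell 2 (address 10)
        AND(a1, a0),            # selects cell 3 (address 11)
    ]

print(decoder_2to4(1, 0))  # -> [0, 0, 1, 0]: cell 2 selected
```

A 64-bit address works the same way, just with wider AND terms; the decode depth stays proportional to the number of address bits, not to the number of cells.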

3yakuya
  • 2,622
  • 4
  • 25
  • 40
0

The component responsible for dealing with memory accesses is the memory controller. It is used by the CPU to read from and write to memory.

The access time is constant because memory words are truly laid out in matrix form (thus, the "byte array" abstraction is very realistic), where you have rows and columns. To fetch a given memory position, the desired memory address is passed to the controller, which then activates the right row and column.

From http://computer.howstuffworks.com/ram1.htm:

Memory cells are etched onto a silicon wafer in an array of columns (bitlines) and rows (wordlines). The intersection of a bitline and wordline constitutes the address of the memory cell.

So, basically, the answer to your question is: the memory controller figures it out. Of course, given a memory address, the mapping to row and column must be easy to calculate in constant time.
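As a sketch of that constant-time mapping (the column width below is a made-up figure, not taken from the linked guide): the controller just slices the address bits, with the high bits picking the row and the low bits picking the column.

```python
# Sketch (assumed layout): for a memory with 2**COL_BITS columns per
# row, splitting an address into (row, column) is plain bit arithmetic
# and takes constant time regardless of memory size.

COL_BITS = 10                      # hypothetical: 1024 columns per row
COL_MASK = (1 << COL_BITS) - 1

def address_to_row_col(address):
    row = address >> COL_BITS      # high bits select the wordline
    col = address & COL_MASK       # low bits select the bitline
    return row, col

print(address_to_row_col(0x2A7F))  # -> (10, 639)
```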

To fully understand this topic, I recommend you to read this guide on how memory works: http://computer.howstuffworks.com/ram.htm

There are so many concepts to master that it is difficult to explain it all in one answer.

Filipe Gonçalves
  • 20,783
  • 6
  • 53
  • 70
  • Thank you, I learned a lot; however, notice that my question remains the same. For if indeed memory is organized into a matrix with rows and columns, then like our good friend @Nemo said, it's O(sqrt(n)) in speed or gates. Because if you have 4 gigabytes of memory then you will have sqrt(4*10^9) rows and columns. So given an address of 0x(*&&^(*& whatever, the row/column decoder still needs to go through sqrt(4*10^9) choices; how does it do that? And don't tell me that it's another matrix :) – Helical Jan 06 '14 at 22:57
  • @user3167049 - The limitation is logN (N being the largest possible address), because an address decoder has a number of gate levels proportional to that. – Hot Licks Jan 07 '14 at 00:07
0

I've been reading your comments and questions up until I answered. I think you are on the right track, but there is some confusion here. The random access you are imagining doesn't exist in the same way you think it does.

Reading, writing, and refreshing are done in a continuous cycle. A particular cell in memory is only read or written in a certain interval if a signal is detected to do so in that cycle. There is going to be support circuitry that includes "sense amplifiers to amplify the signal or charge detected on a memory cell."

Unless I am misunderstanding what you are implying, your confusion is about how easy it is to read/write a cell. It differs depending on chip design, but there IS a minimum number of cycles it takes to read or write data in a cell.

These are my sources:

http://www.doc.ic.ac.uk/~dfg/hardware/HardwareLecture16.pdf
http://www.electronics.dit.ie/staff/tscarff/memory/dram_cycles.htm
http://www.ece.cmu.edu/~ece548/localcpy/dramop.pdf

To avoid a humongous answer, I left most of the detail out, but all three of these will describe the process you are looking for.

Brant Unger
  • 403
  • 2
  • 11