That's a pretty broad question - there are lots of microcontroller architectures out there.
I can start answering it by telling you that many microcontrollers use what is known as a "Harvard architecture", which means that program space and data space are on different internal memory buses. This is a speed optimization that lets the processor fetch the next instruction while it is reading or writing data. In a strict Harvard design, it also means that the processor cannot execute instructions from data space.
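To make the split concrete, here is a minimal sketch of a toy Harvard machine in Python. The instruction set (LOAD/ADD/STORE/HALT) and the names `run_harvard` and `PROGRAM` are invented for illustration and don't correspond to any real part. It only models the separation of the two memories, not the parallel access.

```python
# Toy Harvard machine: program and data live in separate memories.
# The instruction set and all names here are hypothetical, for illustration.

def run_harvard(program, data):
    """Run `program` (a tuple of (opcode, address) pairs) against `data` (a list of ints)."""
    acc = 0  # accumulator
    pc = 0   # program counter: indexes program memory only
    while True:
        op, addr = program[pc]   # instruction fetch uses the program memory
        pc += 1
        if op == "LOAD":
            acc = data[addr]     # operand read uses the data memory
        elif op == "ADD":
            acc += data[addr]
        elif op == "STORE":
            data[addr] = acc     # a store can only ever reach data memory
        elif op == "HALT":
            return acc
        else:
            raise ValueError(f"unknown opcode {op!r}")

# Program memory is an immutable tuple, standing in for read-only program space.
PROGRAM = (("LOAD", 0), ("ADD", 1), ("STORE", 2), ("HALT", 0))
data = [2, 3, 0]

result = run_harvard(PROGRAM, data)
print(result, data)
```

Note that no address a STORE can name will ever land on an instruction: the two memories are indexed separately, so the program comes out of the run exactly as it went in.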
Desktop and other general-purpose computers typically use what is known as a "von Neumann architecture", which has a single address space, and a single memory path, for both instructions and data.
Von Neumann architectures are much more flexible, and so they are well-suited to general computing where many different programs can be loaded and executed dynamically. But they are also vulnerable to bugs that cause data to be executed as instructions. Viruses exploit this weakness by disguising their code as data and then tricking the processor into executing it.
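Here is a rough sketch of how data can end up being executed on a shared-memory machine, again as a toy Python model. The encoding (cell = opcode * 100 + address) and the names `run_von_neumann` and `make_memory` are assumptions made up for this example. The only difference between the two runs is one wrong store address.

```python
# Toy von Neumann machine: one memory holds both instructions and data.
# Hypothetical encoding, for illustration only: cell = opcode * 100 + address.
HALT, LOAD, ADD, STORE = 0, 1, 2, 3

def run_von_neumann(memory):
    """Execute the program stored in `memory`, starting at cell 0."""
    acc = 0
    pc = 0
    while True:
        opcode, addr = divmod(memory[pc], 100)  # fetch from the shared memory
        pc += 1
        if opcode == LOAD:
            acc = memory[addr]
        elif opcode == ADD:
            acc += memory[addr]
        elif opcode == STORE:
            memory[addr] = acc  # nothing stops this write from landing on code
        elif opcode == HALT:
            return acc
        else:
            raise ValueError(f"unknown opcode {opcode}")

def make_memory(store_target):
    """Build the memory image; `store_target` is where the STORE writes."""
    return [
        LOAD * 100 + 5,              # cell 0: acc = memory[5]
        STORE * 100 + store_target,  # cell 1: memory[store_target] = acc
        LOAD * 100 + 6,              # cell 2: acc = memory[6]
        HALT,                        # cell 3: end of the intended program
        HALT,                        # cell 4: padding
        206,                         # cell 5: data (happens to decode as ADD 6)
        7,                           # cell 6: data
    ]

intended = run_von_neumann(make_memory(store_target=6))  # correct address
buggy = run_von_neumann(make_memory(store_target=3))     # bug: overwrites cell 3
print(intended, buggy)
```

In the buggy run the value 206 is written over the HALT in cell 3, and the processor then fetches it and executes it as "ADD 6", so the result changes from 206 to 14. An attacker who controls the data and can provoke the bad write gets the same effect on purpose.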
Microcontrollers, on the other hand, are not intended for general computing. They are designed to be embedded in a purpose-built device, and committed to a specific set of tasks. Harvard architectures are well-suited to this, since a non-volatile, read-only program space is highly desirable - programming errors (or viruses) cannot modify or damage the program, power failures cannot erase the program, etc.
There are far too many microcontrollers for me to catalog, but off the top of my head I can tell you that the Atmel AVR family and the Intel 8051 family both use a Harvard architecture.