Introduction to Parallel Processing
- In computers, parallel processing is the processing of program instructions by dividing them among multiple processors, with the objective of running a program in less time.
- In the earliest computers, only one program ran at a time. A computation-intensive program that took one hour and a tape-copying program that also took one hour would take a total of two hours to run.
- An early form of parallel processing allowed the interleaved execution of both programs together. The computer would start an I/O operation, and while it was waiting for the operation to complete, it would execute the processor-intensive program. The total execution time for the two jobs would be a little over one hour.
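A minimal sketch of this interleaving in Python, assuming time.sleep stands in for the tape I/O wait: while one thread blocks on the simulated I/O, the main thread keeps the processor busy, so the combined wall-clock time is close to the longer of the two jobs rather than their sum.

import threading
import time

def io_job():
    # Simulated tape copy: this thread spends its time waiting on I/O.
    time.sleep(2)

def cpu_job():
    # Simulated computation-intensive program.
    total = 0
    for i in range(10_000_000):
        total += i
    return total

start = time.time()
io_thread = threading.Thread(target=io_job)
io_thread.start()        # the I/O job begins waiting
cpu_job()                # the processor-intensive program runs meanwhile
io_thread.join()
print(f"elapsed: {time.time() - start:.2f} s")  # roughly max(I/O, CPU), not their sum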
Parallel Processing Systems
Parallel processing systems are designed to speed up the execution of programs by dividing a program into multiple fragments and processing these fragments simultaneously. Such systems are multiprocessor systems, also known as tightly coupled systems. Parallel systems deal with the simultaneous use of multiple computer resources, which can include a single computer with multiple processors, a number of computers connected by a network to form a parallel processing cluster, or a combination of both.
- Parallel computing is an evolution of serial computing in which jobs are broken into discrete parts that can be executed concurrently. Each part is further broken down into a series of instructions. Instructions from each part execute simultaneously on different CPUs.
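As a minimal sketch of this decomposition, assuming Python's standard multiprocessing module: one job (summing a range of numbers) is broken into discrete parts, each part is handed to a worker process, and the partial results are combined at the end.

import multiprocessing as mp

def part_sum(chunk):
    # Each worker executes this series of instructions on its own part.
    return sum(chunk)

if __name__ == "__main__":
    data = range(1_000_000)
    n = mp.cpu_count()
    size = len(data) // n + 1
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    with mp.Pool(processes=n) as pool:
        partials = pool.map(part_sum, chunks)  # parts run concurrently on different CPUs
    print(sum(partials))                       # combine the partial results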
- Parallel systems are more difficult to program than computers with a single processor because the architecture of parallel computers varies widely and the processes of multiple CPUs must be coordinated and synchronized.
- The three models most commonly used in building parallel computers are synchronous processors each with its own memory, asynchronous processors each with its own memory, and asynchronous processors with a common, shared memory.
- Flynn classified computer systems based on the parallelism in their instruction and data streams. These are (a short sketch contrasting two of them follows the list):
1. Single instruction stream, single data stream (SISD).
2. Single instruction stream, multiple data stream (SIMD).
3. Multiple instruction stream, single data stream (MISD).
4. Multiple instruction stream, multiple data stream (MIMD).
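As a loose illustration of two of these categories, assuming Python with NumPy installed (NumPy is not part of the standard library): a vectorized array operation applies a single instruction across multiple data elements, in the spirit of SIMD, while separate processes each running their own instruction stream over their own data approximate MIMD.

import multiprocessing as mp
import numpy as np

def square(x):
    # One instruction stream: compute a square.
    return x * x

def cube(x):
    # A different instruction stream: compute a cube.
    return x * x * x

if __name__ == "__main__":
    # SIMD-style: one operation applied element-wise across many data items.
    a = np.arange(8)
    print(a + a)                   # [ 0  2  4  6  8 10 12 14]

    # MIMD-style: different instruction streams on different data, in separate processes.
    with mp.Pool(2) as pool:
        r1 = pool.apply_async(square, (3,))
        r2 = pool.apply_async(cube, (4,))
        print(r1.get(), r2.get())  # 9 64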
Why Use Parallel Computing?
Main Reasons:
- Save time and/or money.
- Solve larger problems: e.g., web search engines and databases processing millions of transactions per second.
- Provide concurrency.
- Use of non-local resources.
Limits to serial computing:
o Transmission speeds - the speed of a serial computer is directly dependent upon how fast data can move through hardware; the transmission limit of copper wire is about 9 cm/nanosecond.
o Limits to miniaturization.
o Economic limitations - it is increasingly expensive to make a single processor faster.
Current computer architectures are increasingly relying upon hardware-level parallelism to improve performance:
- Multiple execution units
- Pipelined instructions
- Multi-core
Multiple execution units
- In computer engineering, an execution unit (also called a functional unit) is a part of the central processing unit (CPU) that performs the operations and calculations as instructed by the computer program.
- It may have its own internal control sequence unit (not to be confused with the CPU's main control unit), its own registers, and other internal units such as an arithmetic logic unit (ALU), a floating-point unit (FPU), or smaller, more specific components.
Instruction pipelining
- Instruction pipelining is a technique that implements a form of parallelism called instruction-level parallelism within a single processor.
- It therefore allows faster CPU throughput (the number of instructions that can be executed in a unit of time) than would otherwise be possible at a given clock rate.
- The basic instruction cycle is broken up into a series of steps called a pipeline. Rather than processing each instruction sequentially (finishing one instruction before starting the next), each instruction is split into a sequence of steps so that different steps can be executed in parallel and instructions can be processed concurrently (starting one instruction before finishing the previous one).
- Pipelining increases instruction throughput by performing multiple operations at the same time, but it does not reduce instruction latency (the time to complete a single instruction from start to finish), since each instruction must still go through all the steps.
- Thus, pipelining increases throughput at the cost of latency, and is frequently used in CPUs but avoided in real-time systems, in which latency is a hard constraint.
- Each instruction is split into a sequence of dependent steps. The first step is always to fetch the instruction from memory; the final step is usually writing the results of the instruction to processor registers or to memory. Pipelining seeks to let the processor work on as many instructions as there are dependent steps, just as an assembly line builds many vehicles at once rather than waiting until one vehicle has passed through the line before admitting the next one.
- The term pipeline is an analogy: just as there is fluid in each link of a pipeline, each part of the processor is occupied with work.
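A minimal sketch of this trade-off, assuming an idealized four-stage pipeline (fetch, decode, execute, write-back) in which every stage takes exactly one clock cycle: each instruction still needs four cycles from start to finish (latency), but once the pipeline is full, one instruction completes every cycle (throughput).

STAGES = 4  # fetch, decode, execute, write-back

def completion_cycle(i, pipelined):
    # Clock cycle at which instruction i (0-based) finishes.
    if pipelined:
        return STAGES + i        # fill the pipeline once, then one result per cycle
    return STAGES * (i + 1)      # each instruction runs start-to-finish alone

n = 8
for pipelined in (False, True):
    total = completion_cycle(n - 1, pipelined)
    print(f"pipelined={pipelined}: {n} instructions take {total} cycles; "
          f"latency is still {STAGES} cycles per instruction")

Running this prints 32 cycles for the serial case and 11 for the pipelined one, while the per-instruction latency stays at four cycles in both.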
Multi-core
- In consumer technologies, multi-core is usually the term used to describe two or more CPUs working together on the same chip.
- Also called multicore technology, it is a type of architecture in which a single physical processor contains the core logic of two or more processors, packaged into a single integrated circuit (IC). This single integrated circuit is called a die.
- Multi-core can also refer to multiple dies packaged together. Multi-core enables the system to perform more tasks with greater overall system performance.
- Multi-core technology can be used in desktops, mobile PCs, servers, and workstations. Compare dual-core, a single chip containing two separate processors (execution cores) in the same IC.
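As a minimal sketch of putting multiple cores to work from a program, assuming Python's standard os and concurrent.futures modules: the program reports how many cores the machine exposes and spreads independent CPU-bound tasks across them with a process pool.

import os
from concurrent.futures import ProcessPoolExecutor

def busy(n):
    # A small CPU-bound task; each copy can run on its own core.
    return sum(i * i for i in range(n))

if __name__ == "__main__":
    print("cores available:", os.cpu_count())
    with ProcessPoolExecutor() as pool:        # worker count defaults to the core count
        results = list(pool.map(busy, [10**6] * 4))
    print(len(results), "tasks completed in parallel")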