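Building a small, simplified CPU is a good Verilog exercise. On the one hand, it teaches the basic structure and principles of a RISC CPU and so reinforces what you know about computer architecture; on the other hand, it gives practice with commonly used Verilog syntax and verification methods.
A CPU runs in three stages: 1. fetch the instruction; 2. decode the instruction; 3. execute the instruction. In more detail, the CPU must: 1. decode instructions; 2. perform arithmetic and logic operations; 3. exchange data with memory and peripherals; 4. provide the control signals the whole system needs. Accordingly, we need an instruction register, an instruction decoder, an arithmetic logic unit (ALU), and an accumulator. In addition, to get the whole system running we need timing and control components: a clock generator, a data controller, a state controller, a program counter, an address multiplexer, and so on.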
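1. Clock generator:
It derives a set of timing signals, fetch and alu_ena, from the external clock. fetch goes to the address multiplexer and the state controller, and alu_ena goes to the arithmetic logic unit. fetch is a control signal obtained by dividing clk by 8. When fetch goes high, the CPU starts executing an instruction; fetch also steers the address multiplexer between the instruction address and the data address. clk itself clocks the instruction register, the accumulator, and the state controller. alu_ena controls the arithmetic and logic operation. The module is as follows: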
module clk_gen(clk,reset,fetch,alu_ena);
input clk,reset;
output fetch,alu_ena;
wire clk,reset;
reg fetch,alu_ena;
reg[7:0] state;
parameter S1 = 8'b00000001,
S2 = 8'b00000010,
S3 = 8'b00000100,
S4 = 8'b00001000,
S5 = 8'b00010000,
S6 = 8'b00100000,
S7 = 8'b01000000,
S8 = 8'b10000000,
idle = 8'b00000000;
always@(posedge clk)
if(reset)
begin
fetch <= 0;
alu_ena <= 0;
state <= idle;
end
else
begin
case(state)
S1:
begin
alu_ena <= 1;
state <= S2;
end
S2:
begin
alu_ena <= 0;
state <= S3;
end
S3:
begin
fetch <= 1;
state <= S4;
end
S4:
begin
state <= S5;
end
S5:
state <= S6;
S6:
state <= S7;
S7:
begin
fetch <= 0;
state <= S8;
end
S8:
begin
state <= S1;
end
idle:
state <= S1;
default:
state <= idle;
endcase
end
endmodule
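To check the divide-by-8 behaviour described above, a small testbench can measure the distance between rising edges of fetch. This is only a sketch: it assumes the clock generator is compiled under the name clk_gen (the name used when the CPU is wired together), and the testbench name, the 100 ns clock period, and the reset timing are illustrative choices.

```verilog
`timescale 1ns/1ns
// Hypothetical testbench; assumes the module above is compiled as clk_gen.
module clk_gen_tb;
reg clk, reset;
wire fetch, alu_ena;
time last_rise;              // time of the previous rising edge of fetch

clk_gen dut (.clk(clk), .reset(reset), .fetch(fetch), .alu_ena(alu_ena));

always #50 clk = ~clk;       // 100 ns clock period (assumed)

// Report the distance between consecutive rising edges of fetch.
always @(posedge fetch)
begin
if (last_rise != 0)
$display("fetch period = %0d ns", $time - last_rise);
last_rise = $time;
end

initial
begin
clk = 0; reset = 1; last_rise = 0;
#120 reset = 0;              // release the synchronous reset after one clock edge
#3000 $finish;
end
endmodule
```

With a 100 ns clock, the reported period should be 800 ns, that is, 8 clock cycles, with fetch high for 4 of them (it is set in S3 and cleared in S7).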
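2. Instruction register
The instruction register holds the current instruction and is clocked by clk. On the rising edge of clk, it stores the byte arriving on the data bus into either its high 8 bits or its low 8 bits; when it does so is controlled by the load_ir signal from the CPU state controller. Each instruction is two bytes (16 bits): the high 3 bits are the opcode and the low 13 bits are the address (the CPU address bus is 13 bits wide, giving an 8K-byte address space). Because the data bus in this design is 8 bits wide, each instruction takes two fetches; the variable state records whether the high byte or the low byte comes next. The module is as follows: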
`timescale 1ns/1ns
module register(opc_iraddr,data,ena,clk,rst);
output [15:0] opc_iraddr;
input [7:0] data;
input ena,clk,rst;
reg [15:0] opc_iraddr;
reg state;
always@(posedge clk)
begin
if(rst)
begin
opc_iraddr <= 16'b0000_0000_0000_0000;
state <= 1'b0;
end
else
begin
if(ena)
begin
casex(state)
1'b0:
begin
opc_iraddr[15:8] <= data;
state <= 1;
end
1'b1:
begin
opc_iraddr[7:0] <= data;
state <= 0;
end
default:
begin
opc_iraddr[15:0] <= 16'bxxxx_xxxx_xxxx_xxxx;
state <= 1'bx;
end
endcase
end
else
state <= 1'b0;
end
end
endmodule
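The two-byte load can be seen with a short testbench that feeds the register a high byte and then a low byte and prints the resulting opcode/address split. The testbench name, the timing, and the example instruction (opcode 101 with address 13'h1801) are assumptions made for illustration.

```verilog
`timescale 1ns/1ns
// Hypothetical testbench for the register module above.
module register_tb;
reg [7:0] data;
reg ena, clk, rst;
wire [2:0] opcode;           // bits [15:13] of the instruction
wire [12:0] ir_addr;         // bits [12:0] of the instruction

register dut (.opc_iraddr({opcode, ir_addr}), .data(data),
.ena(ena), .clk(clk), .rst(rst));

always #50 clk = ~clk;       // rising edges at 50, 150, 250 ns

initial
begin
clk = 0; rst = 1; ena = 0; data = 8'h00;
#100 rst = 0; ena = 1;
data = 8'b101_11000;         // high byte: opcode 101, address bits [12:8]
#100 data = 8'h01;           // low byte: address bits [7:0]
#100 ena = 0;
$display("opcode=%b ir_addr=%h", opcode, ir_addr);
$finish;
end
endmodule
```

After the two enabled clock edges the display should show opcode=101 and ir_addr=1801. Note that dropping ena between the two bytes resets state to 0, so the next byte would be treated as a high byte again.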
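3. Accumulator
The accumulator holds the result of the current computation. When its ena input receives the load_acc signal from the CPU state controller, it captures the value on its data input on the rising clock edge. The module is as follows: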
module accum(accum,data,ena,clk,rst);
output[7:0] accum;
input[7:0] data;
input ena,clk,rst;
reg[7:0] accum;
always@(posedge clk)
begin
if(rst)
accum <= 8'b0000_0000;
else
if(ena)
accum <= data;
end
endmodule
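4. Arithmetic logic unit
The ALU performs addition, AND, XOR, jump, and the other basic operations according to the 8 different opcodes. Combining these basic operations makes it possible to implement many other computations and logical tests. The module is as follows: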
`timescale 1ns/1ns
module alu (alu_out, zero, data, accum, alu_ena, opcode);
output [7:0] alu_out;
output zero;
input [7:0] data, accum;
input [2:0] opcode;
input alu_ena;
reg [7:0] alu_out;
parameter HLT = 3'b000,
SKZ = 3'b001,
ADD = 3'b010,
ANDD = 3'b011,
XORR = 3'b100,
LDA = 3'b101,
STO = 3'b110,
JMP = 3'b111;
assign zero = !accum;
always @(posedge alu_ena)
//if(alu_ena)
begin
casex (opcode)
HLT: alu_out <= accum;
SKZ: alu_out <= accum;
ADD: alu_out <= data + accum;
ANDD: alu_out <= data & accum;
XORR: alu_out <= data ^ accum;
LDA: alu_out <= data;
STO: alu_out <= accum;
JMP: alu_out <= accum;
default: alu_out <= 8'bxxxx_xxxx;
endcase
end
endmodule
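A quick way to exercise the ALU is to apply an opcode and operands and then pulse alu_ena, since the result is only captured on the rising edge of alu_ena. The sketch below assumes the alu module above; the testbench name, the apply task, and the operand values are illustrative.

```verilog
`timescale 1ns/1ns
// Hypothetical testbench for the alu module above.
module alu_tb;
reg [7:0] data, accum;
reg [2:0] opcode;
reg alu_ena;
wire [7:0] alu_out;
wire zero;

alu dut (.alu_out(alu_out), .zero(zero), .data(data), .accum(accum),
.alu_ena(alu_ena), .opcode(opcode));

// Apply one operation, then pulse alu_ena so that the result is captured.
task apply;
input [2:0] op;
input [7:0] d, a;
begin
opcode = op; data = d; accum = a; alu_ena = 0;
#10 alu_ena = 1;             // rising edge triggers the ALU
#10 $display("op=%b data=%h accum=%h -> alu_out=%h zero=%b",
op, d, a, alu_out, zero);
alu_ena = 0;
end
endtask

initial
begin
apply(3'b010, 8'h05, 8'h03); // ADD:  05 + 03 = 08
apply(3'b011, 8'hf0, 8'h3c); // ANDD: f0 & 3c = 30
apply(3'b100, 8'hff, 8'h0f); // XORR: ff ^ 0f = f0
apply(3'b101, 8'ha5, 8'h00); // LDA:  passes data through, zero = 1
$finish;
end
endmodule
```

The zero flag depends only on accum, not on alu_out, which is why it is 1 only in the last call.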
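5. Data controller
The data controller gates the accumulator's data onto the bus. The data bus is a shared channel that carries different content, both instructions and data, at different times, so any component that writes to the bus needs a control signal that switches its output between driving and high impedance. The accumulator's output is switched by the datactl_ena signal from the state controller. The module is as follows: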
module datactl (data,in,data_ena);
output [7:0]data;
input [7:0]in;
input data_ena;
assign data = (data_ena)? in : 8'bzzzz_zzzz;
endmodule
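To illustrate why each bus writer needs its own enable, the sketch below attaches two datactl instances to one shared bus. The testbench name and the driver values are made up for the example; only the datactl module itself comes from the listing above.

```verilog
`timescale 1ns/1ns
// Hypothetical testbench: two tri-state drivers sharing one bus.
module bus_share_tb;
reg [7:0] a_val, b_val;
reg a_ena, b_ena;
wire [7:0] bus;

datactl drv_a (.data(bus), .in(a_val), .data_ena(a_ena));
datactl drv_b (.data(bus), .in(b_val), .data_ena(b_ena));

initial
begin
a_val = 8'h11; b_val = 8'h22;
a_ena = 0; b_ena = 0;
#1 $display("no driver : %h", bus);   // bus floats at high impedance
a_ena = 1;
#1 $display("driver a  : %h", bus);   // bus carries a_val
a_ena = 0; b_ena = 1;
#1 $display("driver b  : %h", bus);   // bus carries b_val
a_ena = 1;
#1 $display("both      : %h", bus);   // contention: conflicting bits become unknown
$finish;
end
endmodule
```

The last case is the one the control signals exist to prevent: when two drivers are enabled at once with different values, the simulator resolves the conflicting bits to x.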
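6. Address multiplexer
The address multiplexer selects whether the output address is the PC address or the data address. During the first 4 clock cycles of each instruction the instruction is read from ROM, so the PC address is output; the last 4 clock cycles are used to read or write RAM, at the address given by the instruction. The select signal is fetch: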
`timescale 1ns/1ns
module adr(addr,fetch,ir_addr,pc_addr);
output [12:0] addr;
input [12:0] ir_addr, pc_addr;
input fetch;
assign addr = fetch? pc_addr : ir_addr;
endmodule
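7. Program counter
The program counter supplies the address of the instruction to be read. Instructions are stored sequentially in memory. The instruction address is formed in one of two ways: 1. sequential execution; 2. a change in the sequential flow of the program. At startup the CPU fetches and executes the instruction at ROM address zero. Each instruction occupies two bytes and clocks the counter twice, so pc_addr increases by 2 and points to the next instruction. For a jump instruction, the load_pc signal is asserted on the counter's load input, and the counter loads the target address (ir_addr) into pc_addr instead of incrementing it. The module is as follows: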
module counter(pc_addr,ir_addr,load,clock,rst);
output[12:0] pc_addr;
input[12:0] ir_addr;
input load,clock,rst;
reg[12:0] pc_addr;
always@(posedge clock or posedge rst)
begin
if(rst)
pc_addr <= 13'b0_0000_0000_0000;
else
if(load)
pc_addr <= ir_addr;
else
pc_addr <= pc_addr + 1;
end
endmodule
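The two ways of forming the next address can be tried out directly on the counter. In this sketch the clock input is pulsed by hand, standing in for the inc_pc pulses it receives in the full CPU; the testbench name, the pulse task, and the jump target 13'h0100 are illustrative assumptions.

```verilog
`timescale 1ns/1ns
// Hypothetical testbench for the counter module above.
module counter_tb;
reg [12:0] ir_addr;
reg load, clock, rst;
wire [12:0] pc_addr;

counter dut (.pc_addr(pc_addr), .ir_addr(ir_addr), .load(load),
.clock(clock), .rst(rst));

// One pulse on the counter's clock input.
task pulse;
begin
#10 clock = 1;
#10 clock = 0;
end
endtask

initial
begin
clock = 0; load = 0; rst = 0; ir_addr = 13'h0100;
#5 rst = 1;                  // asynchronous reset clears pc_addr
#5 rst = 0;
pulse;
pulse;                       // sequential case: two pulses per instruction
$display("after one instruction: pc_addr=%h", pc_addr);
load = 1;
pulse;                       // jump case: pc_addr takes ir_addr
load = 0;
$display("after jump:            pc_addr=%h", pc_addr);
$finish;
end
endmodule
```

The first display should show 0002 and the second 0100.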
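8. State controller
The state controller receives the reset signal rst. While rst is active, it drives ena to 0; ena feeds the state machine and stops it. Once rst is released, ena goes to 1 at the first clock edge on which fetch is high, which starts the state machine.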
`timescale 1ns/1ns
module machinectl(ena,fetch,rst,clk);
input fetch,rst,clk;
output ena;
reg ena;
reg state;
always@(posedge clk)
begin
if(rst)
begin
ena <= 0;
end
else
if(fetch)
begin
ena <= 1;
end
end
endmodule
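9. State machine
The state machine is the control core of the CPU. It generates the control signals that start or stop the other components, and it decides when the CPU fetches instructions and when it reads or writes the I/O ports and RAM. The current state is held in state, which counts the clock cycles elapsed in the current instruction cycle.
An instruction cycle consists of 8 clock cycles, each of which performs a fixed operation:
1. Clock 0: the state machine drives rd and load_ir high and all other outputs low. The instruction register stores the high 8 bits of the instruction from ROM.
2. Clock 1: compared with the previous clock, inc_pc changes from 0 to 1, so the PC increments by 1; ROM supplies the low 8 bits of the instruction, which the instruction register stores.
3. Clock 2: no operation.
4. Clock 3: the PC increments by 1 and points to the next instruction. If the opcode is HLT, the halt output goes high; otherwise all control outputs other than inc_pc are 0.
5. Clock 4: if the opcode is AND, ADD, XOR, or LDA, the data at the given address is read; if it is JMP, the target address is sent to the program counter; if it is STO, the accumulator data is output.
6. Clock 5: if the opcode is AND, ADD, or XOR, the ALU performs the operation; if it is LDA, the data passes through the ALU to the accumulator; if it is SKZ, the accumulator is tested and, if it is 0, the PC increments by 1; if it is JMP, the target address is latched; if it is STO, the data is written to the addressed location.
7. Clock 6: no operation.
8. Clock 7: if the opcode is SKZ and the accumulator is 0, the PC increments by 1 again, skipping one instruction; otherwise the PC is unchanged.
The module is as follows: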
module machine( inc_pc, load_acc, load_pc, rd,wr, load_ir,
datactl_ena, halt, clk1, zero, ena, opcode );
output inc_pc, load_acc, load_pc, rd, wr, load_ir;
output datactl_ena, halt;
input clk1, zero, ena;
input [2:0] opcode;
reg inc_pc, load_acc, load_pc, rd, wr, load_ir;
reg datactl_ena, halt;
reg [2:0] state;
parameter HLT = 3 'b000,
SKZ = 3 'b001,
ADD = 3 'b010,
ANDD = 3 'b011,
XORR = 3 'b100,
LDA = 3 'b101,
STO = 3 'b110,
JMP = 3 'b111;
always @( negedge clk1 )
begin
if ( !ena )
begin
state<=3'b000;
{inc_pc,load_acc,load_pc,rd}<=4'b0000;
{wr,load_ir,datactl_ena,halt}<=4'b0000;
end
else
ctl_cycle;
end
task ctl_cycle;
begin
casex(state)
3'b000: //load high 8 bits of instruction
begin
{inc_pc,load_acc,load_pc,rd}<=4'b0001;
{wr,load_ir,datactl_ena,halt}<=4'b0100;
state<=3'b001;
end
3'b001: //pc increased by one then load low 8bits instruction
begin
{inc_pc,load_acc,load_pc,rd}<=4'b1001;
{wr,load_ir,datactl_ena,halt}<=4'b0100;
state<=3'b010;
end
3'b010: //idle
begin
{inc_pc,load_acc,load_pc,rd}<=4'b0000;
{wr,load_ir,datactl_ena,halt}<=4'b0000;
state<=3'b011;
end
3'b011: //next instruction address setup; instruction decoding starts here
begin
if(opcode==HLT) //the instruction is HLT (halt)
begin
{inc_pc,load_acc,load_pc,rd}<=4'b1000;
{wr,load_ir,datactl_ena,halt}<=4'b0001;
end
else
begin
{inc_pc,load_acc,load_pc,rd}<=4'b1000;
{wr,load_ir,datactl_ena,halt}<=4'b0000;
end
state<=3'b100;
end
3'b100: //fetch operand
begin
if(opcode==JMP)
begin
{inc_pc,load_acc,load_pc,rd}<=4'b0010;
{wr,load_ir,datactl_ena,halt}<=4'b0000;
end
else
if( opcode==ADD || opcode==ANDD ||
opcode==XORR || opcode==LDA)
begin
{inc_pc,load_acc,load_pc,rd}<=4'b0001;
{wr,load_ir,datactl_ena,halt}<=4'b0000;
end
else
if(opcode==STO)
begin
{inc_pc,load_acc,load_pc,rd}<=4'b0000;
{wr,load_ir,datactl_ena,halt}<=4'b0010;
end
else
begin
{inc_pc,load_acc,load_pc,rd}<=4'b0000;
{wr,load_ir,datactl_ena,halt}<=4'b0000;
end
state<=3'b101;
end
3'b101: //operation
begin
if ( opcode==ADD||opcode==ANDD||opcode==XORR||opcode==LDA )
begin //one clock later, operate with the accumulator contents
{inc_pc,load_acc,load_pc,rd}<=4'b0101;
{wr,load_ir,datactl_ena,halt}<=4'b0000;
end
else
if( opcode==SKZ && zero==1)
begin
{inc_pc,load_acc,load_pc,rd}<=4'b1000;
{wr,load_ir,datactl_ena,halt}<=4'b0000;
end
else
if(opcode==JMP)
begin
{inc_pc,load_acc,load_pc,rd}<=4'b1010;
{wr,load_ir,datactl_ena,halt}<=4'b0000;
end
else
if(opcode==STO)
begin
//one clock later, setting wr to 1 writes the data to RAM
{inc_pc,load_acc,load_pc,rd}<=4'b0000;
{wr,load_ir,datactl_ena,halt}<=4'b1010;
end
else
begin
{inc_pc,load_acc,load_pc,rd}<=4'b0000;
{wr,load_ir,datactl_ena,halt}<=4'b0000;
end
state<=3'b110;
end
3'b110: //idle
begin
if ( opcode==STO )
begin
{inc_pc,load_acc,load_pc,rd}<=4'b0000;
{wr,load_ir,datactl_ena,halt}<=4'b0010;
end
else
if ( opcode==ADD||opcode==ANDD||opcode==XORR||opcode==LDA)
begin
{inc_pc,load_acc,load_pc,rd}<=4'b0001;
{wr,load_ir,datactl_ena,halt}<=4'b0000;
end
else
begin
{inc_pc,load_acc,load_pc,rd}<=4'b0000;
{wr,load_ir,datactl_ena,halt}<=4'b0000;
end
state<=3'b111;
end
3'b111: //
begin
if( opcode==SKZ && zero==1 )
begin
{inc_pc,load_acc,load_pc,rd}<=4'b1000;
{wr,load_ir,datactl_ena,halt}<=4'b0000;
end
else
begin
{inc_pc,load_acc,load_pc,rd}<=4'b0000;
{wr,load_ir,datactl_ena,halt}<=4'b0000;
end
state<=3'b000;
end
default:
begin
{inc_pc,load_acc,load_pc,rd}<=4'b0000;
{wr,load_ir,datactl_ena,halt}<=4'b0000;
state<=3'b000;
end
endcase
end
endtask
endmodule
10. Peripheral modules
10.1 Address decoder, which generates the select signals
`timescale 1ns/1ns
module addr_decode( addr, rom_sel, ram_sel);
output rom_sel, ram_sel;
input [12:0] addr;
reg rom_sel, ram_sel;
always @( addr )
begin
casex(addr)
13'b1_1xxx_xxxx_xxxx:{rom_sel,ram_sel} <= 2'b01;
13'b0_xxxx_xxxx_xxxx:{rom_sel,ram_sel} <= 2'b10;
13'b1_0xxx_xxxx_xxxx:{rom_sel,ram_sel} <= 2'b10;
default:{rom_sel,ram_sel} <= 2'b00;
endcase
end
endmodule
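The casex patterns above amount to a simple memory map: addresses 13'h1800 to 13'h1fff select the RAM and everything below selects the ROM. The following sketch probes the boundaries; the testbench name and the choice of probe addresses are illustrative.

```verilog
`timescale 1ns/1ns
// Hypothetical testbench for the addr_decode module above.
module addr_decode_tb;
reg [12:0] addr;
wire rom_sel, ram_sel;

addr_decode dut (.addr(addr), .rom_sel(rom_sel), .ram_sel(ram_sel));

// Drive one address and print the two select lines.
task probe;
input [12:0] a;
begin
addr = a;
#1 $display("addr=%h rom_sel=%b ram_sel=%b", addr, rom_sel, ram_sel);
end
endtask

initial
begin
probe(13'h0000);             // ROM: rom_sel=1 ram_sel=0
probe(13'h17ff);             // last ROM address
probe(13'h1800);             // first RAM address: rom_sel=0 ram_sel=1
probe(13'h1fff);             // last RAM address
$finish;
end
endmodule
```

Since the RAM in this design only sees the low 10 address bits, the 2K RAM window maps onto 1K of storage, so each RAM location appears at two addresses.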
10.2 ROM
module rom( data, addr, read, ena );
output [7:0] data;
input [12:0] addr;
input read, ena;
reg [7:0] memory [13'h1fff:0];
wire [7:0] data;
assign data= ( read && ena )? memory[addr] : 8'bzzzzzzzz;
endmodule
10.3 RAM
module ram( data, addr, ena, read, write );
inout [7:0] data;
input [9:0] addr;
input ena;
input read, write;
reg [7:0] ram [10'h3ff:0];
assign data = ( read && ena )? ram[addr] : 8'hzz;
always @(posedge write)
begin
ram[addr]<=data;
end
endmodule
Finally, connect the modules inside the CPU:
`include "clk_gen.v"
`include "adder.v"
`include "addrmul.v"
`include "alu.v"
`include "cmd_fsm.v"
`include "PCCounter.v"
`include "fsmctl.v"
`include "register.v"
`include "datactl.v"
`timescale 1ns/1ns
module cpu(clk,reset,halt,rd,wr,data,opcode,fetch,ir_addr,pc_addr,addr);
input clk,reset;
output rd,wr,halt;
output[12:0] addr;
output[2:0] opcode;
output fetch;
output[12:0] ir_addr,pc_addr;
inout[7:0] data;
wire clk,reset,halt;
wire[7:0] data;
wire[12:0] addr;
wire rd,wr;
wire fetch,alu_ena;
wire [2:0] opcode;
wire [12:0] ir_addr,pc_addr;
wire [7:0] alu_out,accum;
wire zero,inc_pc,load_acc,load_pc,load_ir,data_ena,contr_ena;
clk_gen m_clk_gen (.clk(clk), .reset(reset), .fetch(fetch), .alu_ena(alu_ena));
register m_register (.data(data), .ena(load_ir), .rst(reset), .clk(clk),
.opc_iraddr({opcode,ir_addr}));
accum m_accum (.data(alu_out), .ena(load_acc), .clk(clk), .rst(reset),
.accum(accum));
alu m_alu (.data(data), .accum(accum), .alu_ena(alu_ena),
.opcode(opcode), .alu_out(alu_out), .zero(zero));
machinectl m_machinectl (.clk(clk), .rst(reset), .fetch(fetch), .ena(contr_ena));
machine m_machine (.inc_pc(inc_pc), .load_acc(load_acc), .load_pc(load_pc),
.rd(rd), .wr(wr), .load_ir(load_ir), .clk1(clk), .datactl_ena(data_ena), .halt(halt),
.zero(zero), .ena(contr_ena), .opcode(opcode));
datactl m_datactl (.in(alu_out), .data_ena(data_ena), .data(data));
adr m_adr (.fetch(fetch), .ir_addr(ir_addr), .pc_addr(pc_addr), .addr(addr));
counter m_counter (.clock(inc_pc), .rst(reset), .ir_addr(ir_addr), .load(load_pc), .pc_addr(pc_addr));
endmodule
The peripheral modules are then connected as well:
`include "cpu.v"
`include "ram.v"
`include "rom.v"
`include "addr_decode.v"
`timescale 1ns/1ns
`define PERIOD 100
module cputop;
reg reset_req,clock;
integer test;
reg [(3*8):0] mnemonic;
reg [12:0] PC_addr, IR_addr;
wire [7:0] data;
wire [12:0] addr;
wire rd,wr,halt,ram_sel,rom_sel;
wire [2:0] opcode;
wire fetch;
wire [12:0] ir_addr,pc_addr;
cpu t_cpu (.clk(clock), .reset(reset_req), .halt(halt), .rd(rd),
.wr(wr), .addr(addr), .data(data), .opcode(opcode),
.fetch(fetch), .ir_addr(ir_addr), .pc_addr(pc_addr));
ram t_ram (.addr(addr[9:0]), .read(rd), .write(wr), .ena(ram_sel), .data(data));
rom t_rom (.addr(addr), .read(rd), .ena(rom_sel), .data(data));
addr_decode t_addr_decode (.addr(addr), .ram_sel(ram_sel), .rom_sel(rom_sel));
endmodule
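The cputop module as listed declares clock and reset_req but never drives them, and the ROM starts out empty. One possible way to drive it is a separate stimulus module that reaches into cputop through hierarchical names. Everything specific here is an assumption: the module name cputop_stim, the file names test1.pro and test1.dat (plain $readmemb text files), the reset length, and the timeout.

```verilog
`timescale 1ns/1ns
// Hypothetical stimulus module; compile it together with cputop.
module cputop_stim;

initial
begin
// Assumed file names, in $readmemb format (binary values, optional @address).
$readmemb("test1.pro", cputop.t_rom.memory);   // program image
$readmemb("test1.dat", cputop.t_ram.ram);      // initial RAM contents
cputop.clock = 1'b0;
cputop.reset_req = 1'b1;
#250 cputop.reset_req = 1'b0;                  // hold reset over a few clock edges
#200000 $display("timeout: no HLT seen");      // arbitrary safety limit
$finish;
end

// 100 ns clock, matching the PERIOD value defined in cputop.
always #50 cputop.clock = ~cputop.clock;

// Stop shortly after the CPU raises halt.
always @(posedge cputop.halt)
begin
$display("HLT at %0t ns, pc_addr=%h", $time, cputop.pc_addr);
#500 $finish;
end
endmodule
```

Keeping the stimulus in its own module leaves cputop untouched; the same statements could instead be placed inside cputop, in which case the cputop. prefixes would be dropped.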
Running the design through VCS gives the following circuit: