Version 1.1 of the software fixes a bug that set the Edge bit in the PerfSel registers. With the Edge bit set, the reported event counts were unexpectedly low.
For this reason, AMD has included four performance monitoring counters in its Athlon series processors. The counters can be set to count either the number or the duration of various events like the ones mentioned above. The counters can then be read with the RDPMC instruction. Unfortunately, the four AMD performance monitoring counters are not compatible with the two in the comparable Intel chips. One of the more serious differences is that they recognize a substantially different set of events.
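As an illustration of what reading a counter involves: RDPMC returns the selected counter in the register pair EDX:EAX, so the two halves have to be joined, and the difference between two readings has to allow for the counter wrapping around. The sketch below shows only that arithmetic. It assumes the Athlon counters are 48 bits wide (check this against AMD's documentation), and the names pmc_join and pmc_delta are hypothetical, not part of the supplied DLL.

```c
#include <stdint.h>

/* Assumption: the Athlon performance counters are 48 bits wide. */
#define PMC_BITS 48
#define PMC_MASK ((UINT64_C(1) << PMC_BITS) - 1)

/* Join the two 32-bit halves that RDPMC returns in EDX (high) and EAX (low). */
static uint64_t pmc_join(uint32_t edx, uint32_t eax)
{
    return (((uint64_t)edx << 32) | eax) & PMC_MASK;
}

/* Events counted between two readings, tolerating a single wrap-around. */
static uint64_t pmc_delta(uint64_t before, uint64_t after)
{
    return (after - before) & PMC_MASK;
}
```

A measurement would take one reading before and one after the code under test and pass both to pmc_delta.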
Setting the counters to monitor a certain event is accomplished by writing to four model-specific registers (MSRs) that control the counters. This must be done in kernel mode, since otherwise the processor generates a General Protection Fault (GPF). The device driver PerfMon.sys is supplied for this purpose.
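The value written to one of these control registers can be sketched as follows. This is a minimal illustration, not the driver's actual code. The bit positions (USR 16, OS 17, Edge 18, EN 22) and the MSR addresses are assumptions based on the Athlon's documented event-select layout and should be verified against AMD's manuals. The function names are hypothetical. The 16-bit code is taken to be the bracketed number printed by the Forth utility (for example 1F42), where the high byte appears to be the unit mask and the low byte the event number.

```c
#include <stdint.h>

/* Assumed bit positions in an event-select (PerfSel) register. */
#define PES_USR  (UINT32_C(1) << 16)  /* count in user mode */
#define PES_OS   (UINT32_C(1) << 17)  /* count in kernel mode */
#define PES_EDGE (UINT32_C(1) << 18)  /* count edges, not occurrences */
#define PES_EN   (UINT32_C(1) << 22)  /* enable the counter */

/* Assumed address of the first of the four event-select MSRs. */
#define MSR_PERFEVTSEL0 UINT32_C(0xC0010000)

/* Build the register value from a 16-bit code such as 0x1F42:
   bits 0-7 select the event, bits 8-15 carry the unit mask. */
static uint32_t perfsel_value(uint16_t code, int count_user, int count_os)
{
    uint32_t v = code;
    if (count_user) v |= PES_USR;
    if (count_os)   v |= PES_OS;
    v |= PES_EN;
    /* PES_EDGE is deliberately left clear. */
    return v;
}

/* MSR address of the event-select register for counter n (0..3). */
static uint32_t perfsel_msr(unsigned n)
{
    return MSR_PERFEVTSEL0 + (n & 3u);
}
```

The sketch only computes the value. The write itself needs the WRMSR instruction and therefore has to be carried out by kernel-mode code such as the driver.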
Do not install or start the device driver unless the above requirements are fulfilled. To enable Intel support you should recompile perfctrs.dll. The DLL is written in plain C and can be built with MS VC++ 4.0 or later.
Second, a DLL provides read and write access to the CPU-specific hardware. This DLL cannot work unless the device driver described above is installed. It is based on prior C++ work by Sami Sallinen.
Third, a small C program tests whether the code is working correctly. The demonstration detects the CPU type and can set up the Athlon on-chip counters for four different events. Finally, it starts a user-specified program. When this user-supplied program stops, the performance monitor data is written to stdout. (In version 1.1 it is possible to specify the events that are monitored.)
Fourth, a Forth utility (in source) is supplied that allows access to the Performance Monitor Counters from any Forth program or word. This is the intended use of all the software mentioned above. Here is the output of the utility when used from within iForth 1.11e (the numbers are just for illustration):
#40175768 VALUE #istr \ sets up the loop count for about 1 second of execution time.
0 VALUE offs \ byte offset into ape; shows the effect of non-aligned memory accesses.
CREATE ape 9 , 8 ,
: test TIMER-RESET #istr 0 DO ape offs + @+ XOR
7 OR 99 AND
7 OR 99 AND
7 OR 99 AND
7 OR 99 AND
7 OR 99 AND 33 + DROP
LOOP .ELAPSED ;
: DO-TEST ( offs -- ) TO offs ['] test TEST-PERFORMANCE ;
FORTH> CR .printID
VendorID: AuthenticAMD, MaxCPUID 1, type 0, family 6, model 4, stepping 2
FPU on-chip
Virtual-8086 mode enhancement
Debugging extensions
Page size extensions
Time-stamp-counter
Model Specific Register support
Physical Address Extensions
Machine Check Exception
CMPXCHG8B
Memory type range register
PTE Global bit
Machine check architecture
CMOV instruction
MMX(tm) Technology
FORTH> 0 do-test 1.000 seconds elapsed.
[00C5] retired_far_control_transfers : 626
[00C7] retired_resync_branches : 3
[00CE] ints_masked_while_pending : 0
[00CF] ints_taken : 193
[00CD] ints_masked : 33,844
[1F42] data_cache_refill_l2_all : 968
[1F43] data_mem_refs_all : 279
[1F44] data_cache_writebacks_all : 1,279
[0045] l1_dltb_misses_l2_dltb_hits : 337
[0046] l1+l2_dtlb_misses : 491
[0047] misaligned_data_references : 14
[0080] ifu_ifetch : 241,072,624
[0081] ifu_ifetch_miss : 398
[0084] l1_itlb_misses : 36
[0085] l1+l2_itlb_misses : 92
[00C0] retired_instructions : 1,004,401,986
[00C1] retired_ops : 1,124,990,805
[00C2] retired_branches : 40,179,116
[00C3] retired_mispredicted_branches : 784
[00C4] retired_taken_branch_mispredict : 40,177,633
[0040] data_cache_access : 401,821,336
[0041] data_cache_misses : 1,682
[1042] data_cache_refill_l2_modified : 350
[0842] data_cache_refill_l2_owner : 0
[0442] data_cache_refill_l2_exclusive : 92
[0242] data_cache_refill_l2_shared : 1
[0142] data_cache_refill_l2_invalid : 212
[1043] data_mem_refs_modified : 130
[0843] data_mem_refs_owner : 0
[0443] data_mem_refs_exclusive : 62
[0243] data_mem_refs_shared : 0
[0143] data_mem_refs_invalid : 0
[1044] data_cache_writebacks_modified : 510
[0844] data_cache_writebacks_owner : 0
[0444] data_cache_writebacks_exclusive : 223
[0244] data_cache_writebacks_shared : 0
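The raw counts in the listing above are easier to interpret as ratios. The following sketch is only an illustration and not part of the supplied software: the helper ratio and the constant names are hypothetical, and the values are copied from the sample listing, which is itself only illustrative.

```c
#include <stdint.h>

/* Values copied from the sample listing (illustrative only). */
static const uint64_t LOOP_COUNT           = UINT64_C(40175768);   /* #istr */
static const uint64_t RETIRED_INSTRUCTIONS = UINT64_C(1004401986); /* [00C0] */
static const uint64_t RETIRED_OPS          = UINT64_C(1124990805); /* [00C1] */
static const uint64_t DATA_CACHE_ACCESS    = UINT64_C(401821336);  /* [0040] */
static const uint64_t DATA_CACHE_MISSES    = UINT64_C(1682);       /* [0041] */

/* Quotient of two counter values; returns 0 when the divisor is 0. */
static double ratio(uint64_t num, uint64_t den)
{
    return den ? (double)num / (double)den : 0.0;
}
```

With these sample numbers, ratio(RETIRED_INSTRUCTIONS, LOOP_COUNT) comes to roughly 25 instructions per loop iteration, ratio(RETIRED_OPS, RETIRED_INSTRUCTIONS) to roughly 1.12 ops per instruction, and ratio(DATA_CACHE_MISSES, DATA_CACHE_ACCESS) shows that data cache misses are negligible for this loop.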
If something goes wrong, Win2K may refuse to boot or may reboot spontaneously. In that case, start up in "safe mode" and delete perfmon.sys.