NeXTdimension ROM image

Started by andreas_g, June 19, 2015, 02:50:54 AM


andreas_g

I don't think that it is necessary to do anything on the hardware side. It is definitely not worth risking the health of the board. I'm quite sure there is a way to obtain the data from the software side.

Furthermore, it might turn out to be too difficult or time-consuming to emulate. It does not look too good at the moment.

M Paquette

Quote from: "andreas_g"
Using the low level emulation approach we should not need to care about all these strange features and data formats. If there is somewhere a complete i860 emulator, it will do all these things for us. We only need to provide it the memory map including a valid ROM and the devices, of which are not too many on the NeXTdimension board. The i860 will then do its work on the data it is provided with and write the result to the video memory. All we need to do then is blitting the pixel data from the video memory to the host screen.

You should be aware that the only reason the i860 was faster at 32 bit graphics than other solutions at the time was because of certain hardware advantages that will not appear in emulation.  With a full emulation of both the i860 and 68040, complete with emulation of the communications overhead, and the lack of hardware speedups such as async multiprocessing, the dedicated i860 FPU pixel processor (used for all drawing, compositing, and 3D code) and a true deep pipelined dual instruction issue engine, I calculate that the level of performance for graphics will be around 12-20% of the current Previous emulation of the NeXTStation Color.

More on the i860 core here: http://www.realworldtech.com/intel-history-lesson/2/

As I mentioned before, you'll get the biggest improvement for your efforts by implementing mechanisms in the emulator to spot blit loops and having the emulator transfer control to a full native implementation.  Something similar is done in commercial emulation software.  For example, the old SoftPC software would spot calls to the VGA/EGA graphics routines and rather than stepping through the VGA code, would call routines on the native platform to perform the same task, often using GPU acceleration where available.  (Guess who wrote that?)
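To make the trap idea concrete, here is a minimal C sketch. It is not taken from SoftPC or from Previous; the `GuestCpu` layout, the `Trap` table, and the register assignment used by `native_fill` are all invented for illustration. The emulator checks the guest program counter against a table of recognised routine entry points and, on a match, runs host code instead of interpreting the guest loop.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical guest state: just enough for the sketch. */
typedef struct {
    uint32_t pc;        /* guest program counter */
    uint32_t regs[32];  /* guest integer registers */
    uint8_t *vram;      /* host pointer to emulated VRAM */
} GuestCpu;

typedef void (*NativeHandler)(GuestCpu *cpu);

typedef struct {
    uint32_t guest_pc;      /* entry address of a recognised guest routine */
    uint32_t return_pc;     /* where the guest continues afterwards */
    NativeHandler handler;  /* host code that performs the same task */
} Trap;

/* Example native replacement: fill regs[18] bytes starting at VRAM
 * offset regs[16] with the low byte of regs[17].
 * The register assignment is an assumption made up for this sketch. */
void native_fill(GuestCpu *cpu)
{
    uint32_t dst = cpu->regs[16];
    uint32_t n = cpu->regs[18];
    for (uint32_t i = 0; i < n; i++)
        cpu->vram[dst + i] = (uint8_t)cpu->regs[17];
}

/* Returns 1 if a trap fired and the guest routine was skipped,
 * 0 if the caller should interpret the instruction at cpu->pc. */
int try_trap(GuestCpu *cpu, const Trap *traps, size_t ntraps)
{
    for (size_t i = 0; i < ntraps; i++) {
        if (traps[i].guest_pc == cpu->pc) {
            traps[i].handler(cpu);
            cpu->pc = traps[i].return_pc;
            return 1;
        }
    }
    return 0;
}
```

A real implementation would also have to verify that the code at the trapped address is really the expected routine, for example by checksumming it, before handing control to the host.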

andreas_g

I'm aware that emulating the NeXTdimension won't speed up Previous, but slow it down. It is not a goal of Previous to provide maximum speed. Emulation speed is intentionally limited to a realistic value, rather than maximum speed. So there is some kind of "reserve". At the moment I see no need to improve efficiency, except for the DSP emulation.

Note: The timings are not accurate at the moment. Therefore the speed of the emulated machine does not match a real machine. Anyway, if the host system is fast enough, the speed of the emulated machine is independent of host performance.

The only reason for emulating the NeXTdimension would be to have 32-bit color, and because we can   :wink:

To have fast enough speed it would be necessary to run the emulated i860 in a separate thread.

At the moment screen drawing and timing work like this: after a defined number of emulated CPU cycles a routine is called that draws the contents of the emulated VRAM to the host screen, skipping unchanged pixels. The routine also checks whether a defined amount of real time has passed since its last call. If not, it waits until that time has passed and then continues emulation. This limits speed to a defined value.
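A rough C sketch of that scheme follows. It is not the actual Previous source; `HostClock`, `blit_changed`, `frame_wait_ns`, and `refresh` are hypothetical names, and the host clock and sleep functions are passed in as callbacks so the sketch stays independent of any particular platform API.

```c
#include <stddef.h>
#include <stdint.h>

/* Host-supplied timing callbacks (assumed interface, not a real API). */
typedef struct {
    int64_t (*now_ns)(void);       /* monotonic host time in nanoseconds */
    void (*sleep_ns)(int64_t ns);  /* block for ns nanoseconds */
} HostClock;

/* Copy only the pixels that changed since the last refresh.
 * Returns the number of pixels actually rewritten. */
size_t blit_changed(const uint32_t *vram, uint32_t *host, size_t npixels)
{
    size_t changed = 0;
    for (size_t i = 0; i < npixels; i++) {
        if (host[i] != vram[i]) {
            host[i] = vram[i];
            changed++;
        }
    }
    return changed;
}

/* Nanoseconds still to wait before the next frame is due (0 if overdue). */
int64_t frame_wait_ns(int64_t last_ns, int64_t now_ns, int64_t interval_ns)
{
    int64_t due = last_ns + interval_ns;
    return now_ns < due ? due - now_ns : 0;
}

/* Called after a fixed number of emulated CPU cycles: redraw, then
 * stall until the real-time interval has elapsed. */
void refresh(const uint32_t *vram, uint32_t *host, size_t npixels,
             const HostClock *clk, int64_t *last_ns, int64_t interval_ns)
{
    blit_changed(vram, host, npixels);
    int64_t wait = frame_wait_ns(*last_ns, clk->now_ns(), interval_ns);
    if (wait > 0)
        clk->sleep_ns(wait);
    *last_ns = clk->now_ns();
}
```

Because the wait is computed from real time, a faster host simply spends longer sleeping, which is why the emulated speed does not depend on host performance once the host is fast enough.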

M Paquette

Quote from: "andreas_g"
The only reason for emulating the NeXTdimension would be to have 32-bit color, and because we can   :wink:

That's what I thought.  It might make a nice personal goal for you.

Quote from: "andreas_g"To have fast enough speed it would be necessary to run the emulated i860 in a separate thread.

Heh.  Just remember, new threads don't necessarily create new CPU cycles.   Once all the processor cores are occupied, that's about it.

Quote from: "andreas_g"
At the moment screen drawing and timing work like this: after a defined number of emulated CPU cycles a routine is called that draws the contents of the emulated VRAM to the host screen, skipping unchanged pixels. The routine also checks whether a defined amount of real time has passed since its last call. If not, it waits until that time has passed and then continues emulation. This limits speed to a defined value.

Idle thought:  Watch for cache flushes that include VRAM addresses.  NeXTStation Color and the NeXTdimension board run VRAM as cacheable by the local processor to enable burst or pipelined framebuffer reads and writes.  DPS or its ND board back end will flush the caches at display time.

andreas_g

Quote from: "M Paquette"Idle thought:  Watch for cache flushes that include VRAM addresses.  NeXTStation Color and the NeXTdimension board run VRAM as cacheable by the local processor to enable burst or pipelined framebuffer reads and writes.  DPS or its ND board back end will flush the caches at display time.
In real hardware screen refreshes are not triggered by the CPU. At the moment screen drawing works like it does on real hardware. The screen is refreshed at a defined interval using the data that is inside the VRAM at that time. I see no need to change this.

It seems I can't get a raw ROM and I did not succeed contacting Jason Eckhardt. So I'll stop here trying to reach my personal goal.

btw.
Of course I know that threads do not create CPU cycles. But processing the i860 in the same thread as the m68k will cause one CPU core to be overwhelmed, while others might be idle. I have this exact problem at the moment while running the DSP.

barcher174

I think I can get the ROM dumped for you. It probably won't be until the end of July though.

--
Brian

andreas_g

barcher174, thank you for your efforts! But it will only make sense if we can get a usable i860 processor core. So at the moment it is not worth the trouble. But if things change, I'll report back.

cbrunschen

Quote from: "andreas_g"barcher174, thank you for your efforts! But it will only make sense if we can get a usable i860 processor core. So at the moment it is not worth the trouble. But if things change, I'll report back.

There seems to be an i860 core in MAME, and it seems to be under the BSD 3-clause license – would that be a possible starting point?

andreas_g

Quote from: "cbrunschen"There seems to be an i860 core in MAME, and it seems to be under the BSD 3-clause license – would that be a possible starting point?
I did notice that. I mentioned it some posts before. The problem with that core is, that it was obviously stripped down for MAME. The full version seems to be lost. I tried contacting the coder of the original code. But I did not get a response yet.

cbrunschen

Quote from: "andreas_g"
Quote from: "cbrunschen"There seems to be an i860 core in MAME, and it seems to be under the BSD 3-clause license – would that be a possible starting point?
I did notice that. I mentioned it some posts before.

So you did, I had missed that; my apologies.

QuoteThe problem with that core is, that it was obviously stripped down for MAME. The full version seems to be lost. I tried contacting the coder of the original code. But I did not get a response yet.

Looking at the MAME code it lists:


MAME-specific notes:
- i860XR emulation only (i860XP unnecessary for MAME).
- No emulation of data and instruction caches (unnecessary for MAME version).
- No emulation of DIM mode or CS8 mode (unnecessary for MAME version).
- No BL/IL/locked sequences (unnecessary for MAME).
- Emulate only the i860's LSB-first mode (BE = 0).


If I see correctly from pictures of a NeXTDimension board, it holds an i860XR at 33 MHz (compare this picture); so that should cover the top item.

I am going to guess that the precise cache timing is probably also not vital for emulation of a NeXTDimension board.

Comparing the remaining items with the programmers manual:

CS8 is a mode that allows booting from 8-bit-wide memory before switching to 64-bit-wide memory for subsequent operation; this sounds like it might be used on a NeXTDimension (if the boot (P)ROM is 8 bits wide)

DIM seems to be "Dual Instruction Mode" where the CPU executes one "core" and one floating-point operation at the same time, in lock-step; that sounds like it would likely be very useful in the kind of code running here; but it might be that simply executing the instructions one after the other (interleaved rather than in parallel) would still produce the same results, and only timing would be affected.

BL/IL/locked sequences seem to be about locking a bus that may be shared among multiple CPUs; this may not be necessary for NeXTDimension emulation.

Endianness – while it might make sense to use big-endianness for data when the main CPU in the system (680x0) is big-endian, the i860 is natively little-endian, and it might be easiest just to have left it as-is.

It seems that a lot of these things could be checked for by looking at the code in the ROM: seeing if it includes an instruction to disable CS8 mode; if any instructions have the 'Dual-instruction' bit set; whether any instruction sets the BE bit in the control register; whether there are any lock/unlock instructions. That could then be used to determine what would be necessary to add to the i860 emulation code, if anything.
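Such a check could be a simple scan over the instruction words. The C sketch below counts 32-bit words matching a mask/value pattern. The one pattern supplied, `DIM_PATTERN`, is an assumption from my reading of the i860 encoding (floating-point primary opcode 0x12 in bits 31..26, dual-instruction D bit in bit 9) and should be verified against the programmer's manual before relying on it. The function names are hypothetical, and the input is assumed to be a ROM image already unpacked into the normal little-endian instruction stream.

```c
#include <stddef.h>
#include <stdint.h>

/* A pattern matches an instruction word when (word & mask) == value. */
typedef struct {
    const char *name;
    uint32_t mask;
    uint32_t value;
} InsnPattern;

/* ASSUMPTION, to be checked against the manual: floating-point
 * instructions use primary opcode 0x12 (bits 31..26) and carry the
 * dual-instruction (D) bit in bit 9. */
const InsnPattern DIM_PATTERN = {
    "fp instruction with D bit set", 0xFC000200u, 0x48000200u
};

/* Assemble a 32-bit word from four little-endian bytes. */
uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Count the aligned 32-bit words in the image that match the pattern. */
size_t count_matches(const uint8_t *rom, size_t len, InsnPattern pat)
{
    size_t hits = 0;
    for (size_t off = 0; off + 4 <= len; off += 4) {
        if ((read_le32(rom + off) & pat.mask) == pat.value)
            hits++;
    }
    return hits;
}
```

The same `count_matches` call could be reused with other patterns (lock/unlock, writes to the control registers) once their encodings have been looked up; those are deliberately left out here rather than guessed.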

// Christian

M Paquette

Quote from: "cbrunschen"
Looking at the MAME code it lists:


MAME-specific notes:
- i860XR emulation only (i860XP unnecessary for MAME).
- No emulation of data and instruction caches (unnecessary for MAME version).
- No emulation of DIM mode or CS8 mode (unnecessary for MAME version).
- No BL/IL/locked sequences (unnecessary for MAME).
- Emulate only the i860's LSB-first mode (BE = 0).


If I see correctly from pictures of a NeXTDimension board, it holds an i860XR at 33 MHz (compare this picture); so that should cover the top item.

I am going to guess that the precise cache timing is probably also not vital for emulation of a NeXTDimension board.

Comparing the remaining items with the programmers manual:

CS8 is a mode that allows booting from 8-bit-wide memory before switching to 64-bit-wide memory for subsequent operation; this sounds like it might be used on a NeXTDimension (if the boot (P)ROM is 8 bits wide)

Correct.  The i860 is in CS8 mode on reset and executing from the one byte wide EEPROM.  (I posted the 'swizzling' code earlier that shows how the 64 bit aligned instruction stream is repacked for use in the ROM, along with how it is repacked for use in a ROM programmer which does its own thing in programming a byte wide device.)  We don't put data in the boot ROM other than as the IMMEDIATE field of instructions.  Data tables needed are built from instructions in the ROM.

Quote
DIM seems to be "Dual Instruction Mode" where the CPU executes one "core" and one floating-point operation at the same time, in lock-step; that sounds like it would likely be very useful in the kind of code running here; but it might be that simply executing the instructions one after the other (interleaved rather than in parallel) would still produce the same results, and only timing would be affected.
DIM is indeed dual instruction mode, a mode in which two opcodes are issued at a time, one to the integer unit and one to the FPU/GPU unit.  Note however that the issued opcodes are in an explicit pipeline, with exposed sequencing including delay slots.  The delay slots are even used when DIM is off.

Look out for sequences like this:

xor 4,r31,r31 // Flip addr to big-endian for fetch...
br 2f
   ld.l 0(r31),r17


That load to r17 is in the branch delay slot.  It executes when the branch is taken.
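For an interpreter, the practical consequence is that a delayed branch must run the following instruction before moving the program counter. The toy C sketch below shows only that ordering. The three-opcode instruction set (`OP_ADDI`, `OP_BR`, `OP_HALT`) is invented for the example and is not i860 encoding, and it ignores cases such as a branch sitting in a delay slot.

```c
#include <stdint.h>

typedef enum { OP_ADDI, OP_BR, OP_HALT } Op;

typedef struct {
    Op op;
    int reg;      /* destination register for OP_ADDI */
    int32_t imm;  /* addend for OP_ADDI, target index for OP_BR */
} Insn;

typedef struct {
    uint32_t pc;  /* index into the program array */
    int32_t r[4];
} Cpu;

/* Execute one non-branch instruction. Returns 0 for anything that is
 * not OP_ADDI (in this toy, that means OP_HALT). */
int exec_simple(Cpu *cpu, Insn in)
{
    if (in.op == OP_ADDI) {
        cpu->r[in.reg] += in.imm;
        return 1;
    }
    return 0;
}

void run(Cpu *cpu, const Insn *prog)
{
    for (;;) {
        Insn in = prog[cpu->pc];
        if (in.op == OP_BR) {
            /* Delay slot: the instruction after the branch executes
             * first, and only then does control move to the target. */
            exec_simple(cpu, prog[cpu->pc + 1]);
            cpu->pc = (uint32_t)in.imm;
        } else if (exec_simple(cpu, in)) {
            cpu->pc++;
        } else {
            return;  /* OP_HALT */
        }
    }
}
```

Running a program of the form "branch, add, add, target" through `run` executes the first add (the slot) and skips the second, which is the behaviour the `ld.l` in the sequence above relies on.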

DIM also impacts how the VM system runs.  Page fault code, for example, has to check BOTH instructions in a pair.

Pretty much all of the window server compositing code is in hand-pipelined DIM code.
Quote
BL/IL/locked sequences seem to be about locking a bus that may be shared among multiple CPUs; this may not be necessary for NeXTDimension emulation.

Endianness – while it might make sense to use big-endianness for data when the main CPU in the system (680x0) is big-endian, the i860 is natively little-endian, and it might be easiest just to have left it as-is.

If you plan on having the i860 emulation work with the NeXTSTEP/OPENSTEP software, it should be big-endian.  The window server and Mach kernel communications paths all assume a big-endian data machine.  If I recall, NeXT was the only i860 customer to use the big-endian mode.
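For the emulator, one way to model this is as an address adjustment on data accesses rather than a byte swap. The C sketch below rests on an assumption that should be checked against the programmer's manual: with BE set, the low address bits of a data access are complemented within the 64-bit bus word according to the access size (XOR 7 for bytes, 6 for 16-bit, 4 for 32-bit, nothing for 64-bit), while instruction fetches stay little-endian. That would be consistent with the `xor 4,r31,r31` in the sequence above, but `be_data_addr` is a hypothetical helper and not part of any existing core.

```c
#include <stdint.h>

/* ASSUMPTION: in big-endian mode a data access of `size` bytes
 * (1, 2, 4 or 8) has its address XORed with (8 - size), so that it
 * lands at the mirrored position inside the 64-bit bus word.
 * With BE clear, or for 64-bit accesses, the address is unchanged. */
uint32_t be_data_addr(uint32_t addr, unsigned size, int be)
{
    if (!be || size >= 8)
        return addr;
    return addr ^ (8u - size);
}
```

Under this model the memory access routines of the core would call `be_data_addr` on every load and store, with `be` taken from the emulated EPSR, and leave the instruction fetch path untouched.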

Quote
It seems that a lot of these things could be checked for by looking at the code in the ROM: seeing if it includes an instruction to disable CS8 mode; if any instructions have the 'Dual-instruction' bit set; whether any instruction sets the BE bit in the control register; whether there are any lock/unlock instructions. That could then be used to determine what would be necessary to add to the i860 emulation code, if anything.

The ROM code drops CS8 as part of the jump to a loaded program in ND main memory.
The ROM code definitely flips on the BE bit in the EPSR.
The i860 LOCK/UNLOCK opcodes are NOT supported by the ND bus or NBIC interface.  Interprocessor locking is done with Lamport's algorithm[1].



1. Leslie Lamport, "A Fast Mutual Exclusion Algorithm", ACM TOCS, Vol. 5-1, February 1987, pp. 1-11
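For readers who have not seen it, here is a sketch of the fast mutual exclusion algorithm from that paper, written with C11 atomics. This is not the NeXT code. The `FastMutex` layout, the function names, and the choice of two participants are assumptions for illustration, and sequentially consistent atomics stand in for whatever ordering guarantees the real shared memory gave. Process ids run from 1 to `NPROC`; 0 means "nobody".

```c
#include <stdatomic.h>
#include <stdbool.h>

#define NPROC 2  /* assumed: e.g. host CPU and i860, ids 1..NPROC */

typedef struct {
    atomic_int x;              /* last process to announce itself */
    atomic_int y;              /* 0 when free, else id of the claimant */
    atomic_bool b[NPROC + 1];  /* b[i]: process i is competing */
} FastMutex;

void fast_lock(FastMutex *m, int i)
{
    for (;;) {
        atomic_store(&m->b[i], true);
        atomic_store(&m->x, i);
        if (atomic_load(&m->y) != 0) {
            /* Someone already claimed the lock: back off and retry. */
            atomic_store(&m->b[i], false);
            while (atomic_load(&m->y) != 0) { }
            continue;
        }
        atomic_store(&m->y, i);
        if (atomic_load(&m->x) != i) {
            /* Contention: wait for all competitors to settle. */
            atomic_store(&m->b[i], false);
            for (int j = 1; j <= NPROC; j++)
                while (atomic_load(&m->b[j])) { }
            if (atomic_load(&m->y) != i) {
                while (atomic_load(&m->y) != 0) { }
                continue;
            }
        }
        return;  /* critical section entered */
    }
}

void fast_unlock(FastMutex *m, int i)
{
    atomic_store(&m->y, 0);
    atomic_store(&m->b[i], false);
}
```

The attraction of the algorithm is that the uncontended path needs only a handful of plain reads and writes to shared memory, with no bus lock, which fits a board whose bus does not support the LOCK/UNLOCK opcodes.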

cuby

Quote from: "M Paquette"The i860 LOCK/UNLOCK opcodes are NOT supported by the ND bus or NBIC interface.  Interprocessor locking is done with Lamport's algorithm[1].

1. Leslie Lamport, "A Fast Mutual Exclusion Algorithm", ACM TOCS, Vol. 5-1, February 1987, pp. 1-11
Wow, this is a great example of a quick transfer of knowledge from academia to industry, considering the NeXTdimension came out in 1990 and development of the software must have started some years earlier.

(Sorry for the off-topic, but since I'm currently teaching computer architecture and parallel systems, this immediately caught my eye...)

-- Michael

andreas_g

I think it would require quite a lot of work to make the i860 code of MAME work for the NeXTdimension. It would require more time than I have available.
At the moment it is even uncertain, if the full version of the i860 emulator would be good enough. But that can only be tested, if it can be found.

I decided to concentrate now on adding support for Turbo machines. That is quite a lot easier and brings the advantage of having 128 MB memory. Nevertheless it will require some time to make everything work (Ethernet and DSP are broken, 3.1 and later crashes on boot).

andreas_g

This does not entirely fit the initial topic, but as this thread is mostly about emulating a NeXTdimension board, I'll post it here:

On the internet there are some warnings not to use the monochrome screen on a system with NeXTdimension and dual monitors:
Quote from: "http://www.ding.net/info/next/next1.html"Subject: L5. Why is my machine so slow when I run the monochrome and NeXTdimension displays?
There is a bug with the window system in which if you select the monochrome display as your primary display the server will be much much slower. The solution for those wishing to use both displays is to select the color (NeXTdimension) display as the primary display. The most optimal configuration at present with the NeXTdimension is to run only the color display.
Quote from: "https://ftp.nice.ch/peanuts/GeneralData/Usenet/news/1991/_CSN-91/comp-sys-next/1991/Jul/_On-2-headed-Cube,-Color-Screen-MUST-have-loginwindow-dock.html"Date: Sun 05-Jul-1991 07:59:37
From: [email protected] (Izumi Ohzawa)
Subject: On 2-headed Cube, Color Screen MUST have loginwindow/dock

This subject came up under another title, but it seems
important enough to  post a summary now that NeXT is
shipping NeXTdimension boards in quantity.

If you configure your NeXTdimension cube as a two-headed
system (with both color and monochrome displays), you
MUST use the NeXTdimension (color) screen as the primary
screen (screen with loginwindow and your Dock).

The other configuration, loginwindow on the Monochrome screen,
suffers from a severe performance loss on the order of a
factor of 10 or more!
The system also eats up the swap space like crazy.
Some applications, e.g., /NextDeveloper/Demos/VideoApp,
will also refuse to start up in this configuration.

The doc "Getting Started with NeXTdimension" that comes
with the board says that you can choose the primary
screen (also called "zero screen") to be either NeXTdimension
or Monochrome by dragging a tiny loginwindow icon between
screens in the Preferences application.
Don't select monochrome as the primary screen, unless you want
your system to run at snail's pace.

I don't know why there is this asymmetry of performance
between the two configurations.
Some explanation from NeXT or a NeXT person would be nice.


Izumi Ohzawa

It was suspected that in this configuration all the processing is done on the 68k and the NeXTdimension is only used as a framebuffer. Does anyone (M Paquette) know if that is true?

M Paquette

Quote from: "andreas_g"This does not entirely fit the initial topic, but as this thread is mostly about emulating a NeXTdimension board, I'll post it here:

On the internet there are some warnings not to use the monochrome screen on a system with NeXTdimension and dual monitors:
Quote from: "http://www.ding.net/info/next/next1.html"Subject: L5. Why is my machine so slow when I run the monochrome and NeXTdimension displays?
There is a bug with the window system in which if you select the monochrome display as your primary display the server will be much much slower. The solution for those wishing to use both displays is to select the color (NeXTdimension) display as the primary display. The most optimal configuration at present with the NeXTdimension is to run only the color display.
Quote from: "https://ftp.nice.ch/peanuts/GeneralData/Usenet/news/1991/_CSN-91/comp-sys-next/1991/Jul/_On-2-headed-Cube,-Color-Screen-MUST-have-loginwindow-dock.html"Date: Sun 05-Jul-1991 07:59:37
From: [email protected] (Izumi Ohzawa)
Subject: On 2-headed Cube, Color Screen MUST have loginwindow/dock

This subject came up under another title, but it seems
important enough to  post a summary now that NeXT is
shipping NeXTdimension boards in quantity.

If you configure your NeXTdimension cube as a two-headed
system (with both color and monochrome displays), you
MUST use the NeXTdimension (color) screen as the primary
screen (screen with loginwindow and your Dock).

The other configuration, loginwindow on the Monochrome screen,
suffers from a severe performance loss on the order of a
factor of 10 or more!
The system also eats up the swap space like crazy.
Some applications, e.g., /NextDeveloper/Demos/VideoApp,
will also refuse to start up in this configuration.

The doc "Getting Started with NeXTdimension" that comes
with the board says that you can choose the primary
screen (also called "zero screen") to be either NeXTdimension
or Monochrome by dragging a tiny loginwindow icon between
screens in the Preferences application.
Don't select monochrome as the primary screen, unless you want
your system to run at snail's pace.

I don't know why there is this asymmetry of performance
between the two configurations.
Some explanation from NeXT or a NeXT person would be nice.


Izumi Ohzawa

It was suspected that in this configuration all the processing is done on the 68k and the NeXTdimension is only used as a framebuffer. Does anyone (M Paquette) know if that is true?

The screen with the loginwindow is treated as the 'main' screen, the one where the user intends to do most work.  The NeXT AppKit uses the main screen as the designated place to cache all of its off-screen artwork, images, and UI elements.

If the 'main' screen, the one with loginwindow, is on the NeXTdimension card then that card holds the cache, and display of UI elements, artwork, and images is quite fast.  If the 2 bit MegaPixel display is the main screen, then all the cache contents are kept with the 2 bit display in main memory and swap, and every drawing operation referencing these elements on the ND board has to copy them over the NeXTbus.  That can be relatively slow.