News

Bloodborne progress

by admin in

As you have probably figured out from recent posts, we continue to improve Bloodborne slowly but steadily. In the new video posted below we have AvPlayer working (AvPlayer is a streaming library that plays MP4 files on PS4), along with some graphics improvements.

We are still far from making this game playable (although Dark Souls, which uses a similar engine, already shows some in-game graphics), but we are progressing slowly to give you more every day.

Stay tuned for more updates soon.

Developing

Implemented guest buffer manager

by admin in

What problem does this solve

Up until this point the main branch used a host memory stream buffer to make all resources accessible to the GPU. This is because AMD hardware has very small alignment requirements for both uniform and storage buffers (only 4 bytes), while NVIDIA requires 64 and 16 bytes respectively. The device-local, host-visible part of memory is also too small on most systems to serve this purpose, and we probably need it for better things.

This adds a ton of overhead to GPU emulation, as every buffer binding needs to memcpy a (sometimes large) chunk of memory to the stream buffer. It was also slow because the GPU was not reading from VRAM. It also didn’t work for storage buffers, where writes would be lost because there was no way of preserving them in the volatile buffer.

This PR aims to solve most of these issues by keeping a GPU-side mirror of the guest address space for the GPU to access. It uses write protection to track any modifications and re-syncs the affected ranges on demand. It’s still a bit incomplete though, in ways I will cover below. It seems to fix flicker in RDR on AMD GPUs, but the flicker still persists on NVIDIA, which needs more investigation.

Basic design

In terms of operations the most important is searching for buffers as this is done multiple times per draw. Tracking page dirtiness must be fast as well. Insertion/deletion of buffers should also be fast but this happens more rarely than the other operations.

For page tracking we employ a bit-based tracker with 4KB granularity, same as the host page size we target. It works on 2 levels; each WordManager is responsible for tracking 4MB of virtual address space and is created on demand when a particular region is invalidated. The MemoryTracker will iterate each manager that touches the region and gather all dirty ranges from each one. All this uses bit operations and avoids heap allocations, so it’s quite fast compared to an interval set.

For the buffer cache, we cache buffers with host page size granularity. This makes things easier as we can avoid having to manage buffer overlaps; each page is exclusively owned by one buffer at a time. Buffers are stored in a multi-level page table that covers most of the virtual address space and has performance comparable to a flat array access, while using far less memory. While at it, I’ve also switched the texture cache to use the same page table, as it should be faster than the existing hash map.
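To make the page table idea concrete, here is a minimal two-level sketch that maps each 4KB page to the buffer that owns it. Everything in it is an assumption for illustration: the PageTable and BufferId names, the 40-bit address width, and the 14/14 bit split are not taken from the actual implementation.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

using BufferId = std::uint32_t;
constexpr BufferId kNoBuffer = 0;  // hypothetical "no owner" value

// Hypothetical two-level table: the top level holds pointers to leaves that
// are only allocated once a page in their range is assigned.
class PageTable {
    static constexpr std::uint64_t kPageBits = 12;   // 4KB pages
    static constexpr std::uint64_t kLevelBits = 14;  // assumed bits per level
    static constexpr std::uint64_t kAddressBits = kPageBits + 2 * kLevelBits;
    static constexpr std::size_t kEntries = std::size_t{1} << kLevelBits;
    using Leaf = std::array<BufferId, kEntries>;

public:
    PageTable() : top_(kEntries) {}

    // Two array indexings and one null check per lookup.
    BufferId Lookup(std::uint64_t address) const {
        assert(address < (1ULL << kAddressBits));
        const std::uint64_t page = address >> kPageBits;
        const Leaf* leaf = top_[page >> kLevelBits].get();
        return leaf ? (*leaf)[page & (kEntries - 1)] : kNoBuffer;
    }

    // Gives every page touched by [address, address + size) to buffer `id`.
    void Assign(std::uint64_t address, std::uint64_t size, BufferId id) {
        assert(size > 0 && address + size <= (1ULL << kAddressBits));
        const std::uint64_t first = address >> kPageBits;
        const std::uint64_t last = (address + size - 1) >> kPageBits;
        for (std::uint64_t page = first; page <= last; ++page) {
            auto& leaf = top_[page >> kLevelBits];
            if (!leaf) {
                leaf = std::make_unique<Leaf>();  // zero-filled, i.e. kNoBuffer
            }
            (*leaf)[page & (kEntries - 1)] = id;
        }
    }

private:
    std::vector<std::unique_ptr<Leaf>> top_;
};
```

With this split, an untouched table costs one array of 16384 pointers, and each leaf is only paid for when a buffer lands in its 64MB range.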

Every time we fetch a buffer, we check if the region is CPU dirty and build a list of copies needed to validate the buffer from CPU data. The data is copied to a staging buffer and the buffer is validated. I’ve added a small optimization in this area, especially for small uniform buffers whose page has not been GPU modified. For those we can skip the cached path and copy the data directly into the device-local, host-visible stream buffer, avoiding a potential renderpass break in games that update uniforms often. Buffer upload reordering is also a potential future optimization, but I imagine it will matter more on tiled GPUs.

GPU modification tracking is partially implemented. The switch to cached buffer objects also raises the issue of alignment again. An easy solution would be to force SSBOs in most cases, but there are still cases where the alignment of 16 is not satisfied. Switching to device buffer addresses is also possible, but would probably result in performance degradation on NVIDIA hardware, as it has fixed-function binding points for UBOs/SSBOs and the hardware probably prefers that you use them.
So on each buffer bind we check the offset and align it down if necessary, adding the remainder to a push constant block that gets added to every buffer access. This results in a bit more overhead during each buffer access, but I believe it’s the simplest approach at the moment without sacrificing much performance.
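The align-down step can be sketched roughly as follows. The struct and function names are hypothetical; in a real Vulkan backend the alignment would come from the minUniformBufferOffsetAlignment or minStorageBufferOffsetAlignment device limits, and the remainder would be written to the push constant block.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result of adjusting one buffer binding.
struct AlignedBinding {
    std::uint64_t bind_offset;    // aligned offset used for the actual bind
    std::uint64_t bind_size;      // size grown so the original range is still covered
    std::uint32_t shader_offset;  // remainder the shader adds to every access
};

// Aligns `offset` down to `alignment`, which must be a power of two.
inline AlignedBinding AlignBinding(std::uint64_t offset, std::uint64_t size,
                                   std::uint64_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    const std::uint64_t aligned = offset & ~(alignment - 1);
    const std::uint64_t remainder = offset - aligned;
    return {aligned, size + remainder, static_cast<std::uint32_t>(remainder)};
}
```

For example, binding 256 bytes at offset 0x1234 with a 64-byte requirement gives a bind offset of 0x1200, a bind size of 308 bytes, and a shader-side offset of 0x34.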

Some notes on potential expansions

The current design should work on any modern GPU. However, we could also take advantage of ReBAR here in many ways. The simplest is to allocate all buffers in device-local, host-visible memory and perform as many updates inline as possible.
A more advanced way to make use of it would be a Linux-only technique that also uses the extremely new extension VK_EXT_map_memory_placed. This allows us to tell vkMapMemory the exact virtual address at which to map our buffer. We can use this to map GPU memory directly into our virtual address space, avoiding the need for page dirty tracking almost entirely.

The cache also makes no attempt to preserve GPU-modified memory regions when a CPU write occurs to an unrelated part of the same page. This means that the next time the buffer is used, part or all of it will get trashed by CPU memory. This is a complex problem to solve, as the guest gives us little indication of when it wants to sync, so it is left for later.

News

KickBeat Special Edition

by admin in

Today we have KickBeat Special Edition. It appears to be mostly playable; some flickering issues appear, but you can play the game at normal speed. A new release will come soon, so stay tuned for more updates.

News

Shenmue II ingame

by admin in

Shenmue II appears to work pretty well on our latest 0.1.1 WIP version. Stay tuned for the shadps4 release. It will come soon 🙂

News

Dark Souls Continue…

by admin in

With some fake imedialog input and some GPU fixes, Dark Souls can now get past the character screen and plays some FMV videos. It is still not in-game due to some remaining GPU issues, but we are getting close 🙂

https://youtu.be/HqNUtYBm1Wg

News

Dark Souls Remastered

by admin in

Today we have Dark Souls Remastered reaching the character selection screen, similar to Bloodborne. Stay tuned for more exciting news soon.

Bloodborne colors progress

by admin in

A new contributor called “Roamic” fixed some issues in Bloodborne, so it is not red any more. The game still doesn’t progress further, but progress is progress 🙂

News

Bloodborne

by admin in

Today we have a big surprise for y'all. Shadps4 is the first PS4 emulator to reach the character screen in Bloodborne. Wait... it is still far from playable, but a start has been made 🙂

Releases

shadps4 v0.1.0 released

by admin in

A brand new version of shadps4 has just landed. This version is the first one that shows signs of life in a number of commercial games (mostly simple ones). It is also the first release with Linux support.

Grab your copy in the downloads section.

You can also check (or maybe submit to) the compatibility list from the menu above.