I made a game engine like that long ago, when I had my first 80386SX at 25 MHz with a VGA card. It had many layers and parallax scrolling. I just copied everything to the video buffer back to front, starting with the furthest layer and drawing each one on top of the last. That is by far the fastest way to do it.
OK, I did write an efficient loop in assembler that did the actual transparency check and copy. Everything else was Borland Pascal objects (linked lists).
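For anyone who wants to see the idea, here is a rough sketch of that back-to-front copy with a transparency check. It is in Python for readability, not the original assembler and Pascal, and everything in it is an assumption for illustration: the names `blit_layer` and `render`, the use of color index 0 as the transparent value, the per-layer parallax factor, and the horizontal wrap-around.

```python
TRANSPARENT = 0  # assumed color-key index; the original value is not stated


def blit_layer(framebuffer, fb_width, fb_height, layer, scroll_x):
    """Copy one layer into the framebuffer, skipping transparent pixels.

    framebuffer is a flat bytearray of fb_width * fb_height palette indices.
    layer is a list of rows; it wraps around horizontally when scrolled.
    """
    layer_width = len(layer[0])
    for y in range(min(fb_height, len(layer))):
        src_row = layer[y]
        row_start = y * fb_width
        for x in range(fb_width):
            pixel = src_row[(x + scroll_x) % layer_width]
            if pixel != TRANSPARENT:  # the transparency check
                framebuffer[row_start + x] = pixel  # the copy


def render(framebuffer, fb_width, fb_height, layers, camera_x):
    """Draw all layers, furthest first, so nearer layers overwrite farther ones.

    layers is a list of (pixels, parallax_factor) pairs ordered back to front.
    A smaller factor makes a layer scroll more slowly than the camera.
    """
    for pixels, factor in layers:
        blit_layer(framebuffer, fb_width, fb_height, pixels, int(camera_x * factor))


# Tiny 4x2 example: a solid background and a mostly transparent foreground.
background = [[1, 1, 1, 1], [1, 1, 1, 1]]
foreground = [[0, 2, 0, 0], [0, 0, 0, 3]]
frame = bytearray(4 * 2)
render(frame, 4, 2, [(background, 0.5), (foreground, 1.0)], camera_x=1)
```

No sorting or depth test is needed at draw time, because the order of the list is the depth order. The inner loop is the part that would be worth hand-optimizing.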
Even with many layers and animated sprites, it always ran at 60 fps (vsync).
And I blew up a monitor by experimenting with custom video modes.