The big contributors to the huge file sizes we have today are:
1. High variety of detailed textures
2. Ability to include high-detail character and level geometry
3. Uncompressed audio, which maximizes audio fidelity while reducing CPU burden
4. High-detail FMV (full-motion video)
Much of this comes down to current hardware being more capable: it can store and present far more data than previous consoles could. When 3D games are made, the base assets are often much higher detail than what actually makes it into the final game. Newer hardware lets developers plan more ambitious things and get away with less compression. We're still largely on the same path with these trends as we were in the PS2 and PS3 days. More detail:
1. The large amount of storage available lets developers ship a wide variety of high-detail textures, which is an easy and effective way to use the space. Once a base texture is created, it is easy to create a separate variant of it for use in a special situation. Also, most textures you see in newer games aren't actually made up of just one texture file; they are made up of several layers of textures put on top of one another.
2. Many important characters and objects are originally created at a detail level that is far higher than what ends up in the final game. Increased storage, memory, and CPU capability enable that work to be compressed less and keep more of the original detail. Also, developers can plan for environments that have far more detail.
3. Uncompressed sound is used a lot now because developers have the storage space, because it gives a higher-quality end result, and because it saves the CPU from needing to decompress audio files on the fly. Current game consoles have plenty of memory bandwidth but the CPUs are a little lacking, so any effort that frees up the CPU can lead to things like larger levels, more systems, more detailed physics, or more enemies. Also, it's much easier for indie or small titles to just record their music and put it in an MP3 file instead of coding it into MIDI, which would take up far less space.
4. While games are using FMV less and less, some games still use it to cover up loading times in story scenes where something happens that the game engine can't handle natively, such as switching between locations quickly or a big climactic event. To avoid being jarring next to the in-engine visuals, the video quality has to be really high, so if FMV is used today, even a couple of minutes of it will get into the gigabyte range.
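To put rough numbers on points 3 and 4, here is a back-of-the-envelope sketch in Python. The figures are my own assumptions, not measurements from any particular game: CD-quality PCM (44.1 kHz, 16-bit, stereo) for the uncompressed audio, a 128 kbit/s stream standing in for a typical compressed track, and a 1 GB, two-minute clip for the FMV. The function names are made up for illustration.

```python
def pcm_bytes(seconds, sample_rate=44_100, bits_per_sample=16, channels=2):
    """Size of uncompressed PCM audio (defaults are CD quality, an assumption)."""
    return seconds * sample_rate * (bits_per_sample // 8) * channels


def stream_bytes(seconds, bitrate_bps):
    """Size of a constant-bitrate stream (compressed audio or video)."""
    return seconds * bitrate_bps // 8


def implied_bitrate_bps(size_bytes, seconds):
    """Average bitrate a file of this size and duration works out to."""
    return size_bytes * 8 / seconds


MB = 1_000_000

track = 180  # a three-minute music track, in seconds
print(f"uncompressed PCM:  {pcm_bytes(track) / MB:.2f} MB")
print(f"128 kbit/s stream: {stream_bytes(track, 128_000) / MB:.2f} MB")

# Assumed FMV clip: 1 GB for two minutes of video.
clip_bitrate = implied_bitrate_bps(1_000_000_000, 120)
print(f"1 GB over 2 minutes is about {clip_bitrate / 1_000_000:.1f} Mbit/s")
```

Under those assumptions, the uncompressed track comes out roughly eleven times larger than the compressed one, and that is for a single stereo track before you count dialogue, sound effects, or multiple languages.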