Up to date

This page is up to date for Godot 4.2. If you still find outdated information, please open an issue.

General optimization tips¶

Introduction¶

In an ideal world, computers would run at infinite speed. The only limit to what we could achieve would be our imagination. However, in the real world, it's all too easy to produce software that will bring even the fastest computer to its knees.

Thus, designing games and other software is a compromise between what we would like to be possible, and what we can realistically achieve while maintaining good performance.

To achieve the best results, we have two approaches:

Work faster.
Work smarter.

And preferably, we will use a blend of the two.

Smoke and mirrors¶

Part of working smarter is recognizing that, in games, we can often get the player to believe they're in a world that is far more complex, interactive, and graphically exciting than it really is. A good programmer is a magician, and should strive to learn the tricks of the trade while trying to invent new ones.

The nature of slowness¶

To the outside observer, performance problems are often lumped together. But in reality, there are several different kinds of performance problems:

A slow process that occurs every frame, leading to a continuously low frame rate.
An intermittent process that causes "spikes" of slowness, leading to stalls.
A slow process that occurs outside of normal gameplay, for instance, when loading a level.

Each of these is annoying to the user, but in different ways.

Measuring performance¶

Probably the most important tool for optimization is the ability to measure performance - to identify where bottlenecks are, and to measure the success of our attempts to speed them up.

There are several methods of measuring performance, including:

Putting a start/stop timer around code of interest.
Using the Godot profiler.
Using external CPU profilers.
Using external GPU profilers/debuggers such as NVIDIA Nsight Graphics, Radeon GPU Profiler or Intel Graphics Performance Analyzers.
Checking the frame rate (with V-Sync disabled). Third-party utilities such as RivaTuner Statistics Server (Windows) or MangoHud (Linux) can also be useful here.
Using an unofficial debug menu add-on.
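The first method in the list, a start/stop timer around the code of interest, can be sketched as follows. This is a minimal, language-neutral illustration in Python rather than Godot API: `stopwatch`, `update_enemies` and the `timings` dictionary are hypothetical names chosen for the example. In a Godot script, the same pattern can be built on `Time.get_ticks_usec()`.

```python
import time
from contextlib import contextmanager


@contextmanager
def stopwatch(label, results, clock=time.perf_counter):
    """Add the elapsed time of the wrapped block, in ms, to results[label]."""
    start = clock()
    try:
        yield
    finally:
        elapsed_ms = (clock() - start) * 1000.0
        # Accumulate so repeated measurements under one label add up.
        results[label] = results.get(label, 0.0) + elapsed_ms


def update_enemies(count):
    # Hypothetical stand-in for the game code being measured.
    return sum(i * i for i in range(count))


timings = {}
with stopwatch("update_enemies", timings):
    update_enemies(100_000)

print(f"update_enemies took {timings['update_enemies']:.3f} ms")
```

The clock is passed in as a parameter so that the timer itself can be checked with a fake clock; the measured value will differ from machine to machine.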

Be very aware that the relative performance of different areas can vary on different hardware. It's often a good idea to measure timings on more than one device. This is especially the case if you're targeting mobile devices.

Limitations¶

CPU profilers are often the go-to method for measuring performance. However, they don't always tell the whole story.

Bottlenecks are often on the GPU, "as a result" of instructions given by the CPU.
Spikes can occur in the operating system processes (outside of Godot) "as a result" of instructions used in Godot (for example, dynamic memory allocation).
You may not always be able to profile specific devices like a mobile phone due to the initial setup required.
You may have to solve performance problems that occur on hardware you don't have access to.

As a result of these limitations, you often need to use detective work to find out where bottlenecks are.

Detective work¶

Detective work is a crucial skill for developers (both in terms of performance, and also in terms of bug fixing). This can include hypothesis testing, and binary search.

Hypothesis testing¶

Say, for example, that you believe sprites are slowing down your game. You can test this hypothesis by:

Measuring the performance when you add more sprites, or take some away.

This may lead to a further hypothesis: does the size of the sprite determine the performance drop?

You can test this by keeping everything the same, but changing the sprite size, and measuring performance.
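A hedged sketch of such an experiment: vary one quantity (here, the number of sprites), keep everything else fixed, and record the timing for each setting. The code is plain Python used for illustration only; `draw_sprites` is a hypothetical stand-in for the real workload, and `measure_ms` and `scaling_table` are names invented for this example.

```python
import time


def measure_ms(workload, n, repeats=5, clock=time.perf_counter):
    """Return the best-of-`repeats` time in ms for one call of workload(n)."""
    best = float("inf")
    for _ in range(repeats):
        start = clock()
        workload(n)
        best = min(best, (clock() - start) * 1000.0)
    return best


def draw_sprites(count):
    # Hypothetical stand-in for the code suspected of being slow.
    total = 0
    for i in range(count):
        total += (i * 31) % 7
    return total


def scaling_table(workload, counts, **kwargs):
    """Measure the workload at each count, changing nothing else."""
    return [(n, measure_ms(workload, n, **kwargs)) for n in counts]


for n, ms in scaling_table(draw_sprites, [1_000, 2_000, 4_000, 8_000]):
    print(f"{n:>6} sprites: {ms:.3f} ms")
```

If the time grows roughly in proportion to the count, the hypothesis is supported. If it barely changes, the bottleneck is probably elsewhere.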

Binary search¶

If you know that frames are taking much longer than they should, but you're not sure where the bottleneck lies, you could begin by commenting out approximately half the routines that occur on a normal frame. Has the performance improved more or less than expected?

Once you know which of the two halves contains the bottleneck, you can repeat this process until you've pinned down the problematic area.
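The halving procedure can be sketched as below. This is an illustrative Python sketch under simplifying assumptions: exactly one routine is responsible for the slowdown, and `frame_time_ms` is a hypothetical callback that runs a frame with only the given routines enabled and returns the measured time. The routine names and costs are made up.

```python
def find_bottleneck(routines, frame_time_ms, budget_ms):
    """Bisect a list of routine names to find the single slow one."""
    candidates = list(routines)
    if not candidates:
        raise ValueError("no routines to search")
    while len(candidates) > 1:
        half = len(candidates) // 2
        first, second = candidates[:half], candidates[half:]
        # Run with only the first half enabled. If the frame is still over
        # budget, the culprit is in that half; otherwise it is in the other.
        if frame_time_ms(first) > budget_ms:
            candidates = first
        else:
            candidates = second
    return candidates[0]


# Simulated per-routine costs in ms (assumed values for the example).
costs = {"physics": 1.0, "ai": 1.0, "pathfinding": 30.0, "audio": 1.0, "ui": 1.0}


def simulated_frame_time(enabled):
    return sum(costs[name] for name in enabled)


print(find_bottleneck(list(costs), simulated_frame_time, budget_ms=16.0))
```

In a real project the "callback" is usually you, commenting code out and rerunning the game, and several routines may share the blame, so treat the result as a lead rather than a verdict.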

Profilers¶

Profilers allow you to time your program while running it. Profilers then provide results telling you what percentage of time was spent in different functions and areas, and how often functions were called.

This can be very useful both to identify bottlenecks and to measure the results of your improvements. Sometimes, attempts to improve performance can backfire and lead to slower performance. Always use profiling and timing to guide your efforts.

For more info about using Godot's built-in profiler, see The Profiler.

Principles¶

Donald Knuth said:

Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%.

The messages are very important:

Developer time is limited. Instead of blindly trying to speed up all aspects of a program, we should concentrate our efforts on the aspects that really matter.
Efforts at optimization often end up with code that is harder to read and debug than non-optimized code. It is in our interests to limit this to areas that will really benefit.

Just because we can optimize a particular bit of code, it doesn't necessarily mean that we should. Knowing when and when not to optimize is a great skill to develop.

One misleading aspect of the quote is that people tend to focus on the subquote "premature optimization is the root of all evil". While premature optimization is (by definition) undesirable, performant software is the result of performant design.

Performant design¶

The danger with encouraging people to ignore optimization until necessary is that it conveniently ignores that the most important time to consider performance is at the design stage, before a key has even hit a keyboard. If the design or algorithms of a program are inefficient, then no amount of polishing the details later will make it run fast. It may run faster, but it will never run as fast as a program designed for performance.

This tends to be far more important in game or graphics programming than in general programming. A performant design, even without low-level optimization, will often run many times faster than a mediocre design with low-level optimization.

Incremental design¶

Of course, in practice, unless you have prior knowledge, you are unlikely to come up with the best design the first time. Instead, you'll often make a series of versions of a particular area of code, each taking a different approach to the problem, until you come to a satisfactory solution. It's important not to spend too much time on the details at this stage until you have finalized the overall design. Otherwise, much of your work will be thrown out.

It's difficult to give general guidelines for performant design because this is so dependent on the problem. One point worth mentioning though, on the CPU side, is that modern CPUs are nearly always limited by memory bandwidth. This has led to a resurgence in data-oriented design, which involves designing data structures and algorithms for cache locality of data and linear access, rather than jumping around in memory.
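As a rough illustration of the layout difference behind data-oriented design, the sketch below stores the same particles in two ways: one object per particle, and one contiguous array per field that is walked linearly. It is written in Python only to show the shape of the data; `Particle`, `step_objects` and `step_arrays` are hypothetical names, and in an interpreted language the interpreter overhead will usually dominate, so the cache benefit applies mainly to compiled code. Always profile before restructuring.

```python
from array import array


class Particle:
    """One object per particle: fields of different particles are scattered."""

    def __init__(self, x, vx):
        self.x = x
        self.vx = vx


def step_objects(particles, dt):
    # Each iteration follows a reference to a separate object in memory.
    for p in particles:
        p.x += p.vx * dt


def step_arrays(xs, vxs, dt):
    # Each field lives in its own contiguous buffer and is read in order.
    for i in range(len(xs)):
        xs[i] += vxs[i] * dt


particles = [Particle(float(i), 2.0) for i in range(4)]
xs = array("d", (float(i) for i in range(4)))
vxs = array("d", [2.0] * 4)

step_objects(particles, 0.5)
step_arrays(xs, vxs, 0.5)

print([p.x for p in particles])
print(list(xs))
```

Both versions compute the same positions; only the memory layout and access pattern differ.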

The optimization process¶

Assuming we have a reasonable design, and taking our lessons from Knuth, our first step in optimization should be to identify the biggest bottlenecks - the slowest functions, the low-hanging fruit.

Once we've successfully improved the speed of the slowest area, it may no longer be the bottleneck. So we should test/profile again and find the next bottleneck on which to focus.

The process is thus:

1. Profile / identify the bottleneck.
2. Optimize the bottleneck.
3. Return to step 1.
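The loop above can be simulated with made-up numbers to show how the bottleneck moves after each round. This is a sketch under stated assumptions, not a real profiler: `costs_ms` maps hypothetical function names to measured per-frame costs, and `optimize` is a hypothetical callback that returns the cost after one optimization attempt.

```python
def optimize_until(costs_ms, budget_ms, optimize):
    """Profile, optimize the biggest bottleneck, and repeat until in budget."""
    costs = dict(costs_ms)
    order = []
    while sum(costs.values()) > budget_ms:
        # Step 1: identify the current biggest bottleneck.
        name = max(costs, key=costs.get)
        # Step 2: try to optimize it.
        new_cost = optimize(name, costs[name])
        if new_cost >= costs[name]:
            break  # No improvement: stop rather than loop forever.
        costs[name] = new_cost
        order.append(name)
        # Step 3: go around again with fresh measurements.
    return costs, order


measured = {"render": 9.0, "ai": 6.0, "physics": 3.0}
final, order = optimize_until(measured, 10.0, lambda name, cost: cost / 2)

print(order)
print(final)
```

With these assumed numbers, the work alternates between `render` and `ai` because each improvement changes which function is slowest, which is why re-profiling after every change matters.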

Optimizing bottlenecks¶

Some profilers will even tell you which parts of a function (which data accesses or calculations) are slowing things down.

As with design, you should concentrate your efforts first on making sure the algorithms and data structures are the best they can be. Data access should be local (to make best use of CPU cache), and it can often be better to use compact storage of data (again, always profile to test results). Often, you can precalculate heavy computations ahead of time. This can be done by performing the computation when loading a level, by loading a file containing precalculated data, or simply by storing the results of complex calculations in a script constant and reading its value.
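A minimal sketch of precalculation, assuming the expensive function only ever needs to be evaluated at whole-degree angles: the table is built once (for example at load time) and then read like a constant. `SINE_TABLE` and `fast_sin_degrees` are hypothetical names, and sine is only a stand-in for a genuinely heavy computation. Whether a lookup beats recomputing is something to confirm with a profiler.

```python
import math

# Built once, e.g. while loading, then treated as read-only data.
SINE_TABLE_SIZE = 360
SINE_TABLE = tuple(math.sin(math.radians(d)) for d in range(SINE_TABLE_SIZE))


def fast_sin_degrees(degrees):
    """Look up sin() for a whole number of degrees instead of recomputing it."""
    return SINE_TABLE[int(degrees) % SINE_TABLE_SIZE]


print(fast_sin_degrees(30))
print(fast_sin_degrees(450))  # Wraps around to the entry for 90 degrees.
```

The trade-off is memory and precision: the table costs storage and only answers for the inputs it was built for.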

Once algorithms and data are good, you can often make small changes in routines which improve performance. For instance, you can move some calculations outside of loops or transform nested for loops into non-nested loops. (This should be feasible if you know a 2D array's width or height in advance.)
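Both small changes mentioned above can be seen in one hedged example. It assumes a 2D image stored as a flat list of `width * height` values; `brighten_nested` and `brighten_flat` are hypothetical functions that produce the same result, the second with the loop-invariant calculation hoisted and the nested loops collapsed into one.

```python
def brighten_nested(pixels, width, height, gain, gamma):
    out = [0.0] * (width * height)
    for y in range(height):
        for x in range(width):
            # Loop-invariant work: recomputed for every pixel.
            scale = gain ** gamma
            out[y * width + x] = pixels[y * width + x] * scale
    return out


def brighten_flat(pixels, width, height, gain, gamma):
    # Hoisted: computed once, outside the loop.
    scale = gain ** gamma
    # Knowing width and height in advance lets us walk the data in one pass.
    return [p * scale for p in pixels[: width * height]]


image = [float(i) for i in range(12)]  # A 4x3 image, row by row.
print(brighten_nested(image, 4, 3, 2.0, 2.0) == brighten_flat(image, 4, 3, 2.0, 2.0))
```

Checking that the optimized version returns the same output as the original, as the last line does, is a cheap safeguard against introducing bugs while optimizing.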

Always retest your timing/bottlenecks after making each change. Some changes will increase speed, others may have a negative effect. Sometimes, a small positive effect will be outweighed by the negatives of more complex code, and you may choose to leave out that optimization.

Appendix¶

Bottleneck math¶

The proverb "a chain is only as strong as its weakest link" applies directly to performance optimization. If your project is spending 90% of the time in function A, then optimizing A can have a massive effect on performance.

A: 9 ms
Everything else: 1 ms
Total frame time: 10 ms

A: 1 ms
Everything else: 1 ms
Total frame time: 2 ms

In this example, improving this bottleneck A by a factor of 9× decreases overall frame time by 5× while increasing frames per second by 5×.

However, if something else is running slowly and also bottlenecking your project, then the same improvement can lead to less dramatic gains:

A: 9 ms
Everything else: 50 ms
Total frame time: 59 ms

A: 1 ms
Everything else: 50 ms
Total frame time: 51 ms

In this example, even though we have hugely optimized function A, the actual gain in terms of frame rate is quite small.
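The arithmetic behind both examples can be written out directly. In this small sketch, `overall_speedup` and `fps` are hypothetical helper names, and the model simply assumes that frame time is the sum of function A and everything else, as in the figures above.

```python
def overall_speedup(a_before_ms, a_after_ms, rest_ms):
    """Ratio of total frame time before and after optimizing function A."""
    return (a_before_ms + rest_ms) / (a_after_ms + rest_ms)


def fps(frame_time_ms):
    return 1000.0 / frame_time_ms


# First example: A dominates the frame.
print(f"{overall_speedup(9, 1, 1):.2f}x")   # 10 ms -> 2 ms
# Second example: something else dominates the frame.
print(f"{overall_speedup(9, 1, 50):.2f}x")  # 59 ms -> 51 ms
print(f"{fps(59):.1f} FPS -> {fps(51):.1f} FPS")
```

The same 9× improvement to A yields a 5× overall gain in the first case but only about 1.16× in the second, because the rest of the frame is unchanged.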

In games, things become even more complicated because the CPU and GPU run independently of one another. Your total frame time is determined by the slower of the two.

CPU: 9 ms
GPU: 50 ms
Total frame time: 50 ms

CPU: 1 ms
GPU: 50 ms
Total frame time: 50 ms

In this example, we optimized the CPU hugely again, but the frame time didn't improve because we are GPU-bottlenecked.
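Under the simplified model used here, where the CPU and GPU work fully in parallel, the frame time is the larger of the two values rather than their sum. The sketch below encodes that assumption; `frame_time_ms` and `bound_by` are hypothetical helper names.

```python
def frame_time_ms(cpu_ms, gpu_ms):
    """Frame time when CPU and GPU work overlap completely (assumed model)."""
    return max(cpu_ms, gpu_ms)


def bound_by(cpu_ms, gpu_ms):
    """Name the side that currently determines the frame time."""
    return "CPU" if cpu_ms > gpu_ms else "GPU"


for cpu, gpu in [(9, 50), (1, 50)]:
    print(f"CPU {cpu} ms, GPU {gpu} ms -> "
          f"{frame_time_ms(cpu, gpu)} ms ({bound_by(cpu, gpu)}-bound)")
```

Knowing which side is the limit tells you where optimization effort can pay off: reducing CPU time in a GPU-bound frame leaves the total unchanged.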