Blog

Working With Textures

Filtering of textures:

In texture mapping, we use filtering to smooth textures when they are displayed larger or smaller than their native resolution. The most commonly used method is bilinear filtering, which uses bilinear interpolation: linear interpolation extended to a 2D coordinate space. Bilinear filtering performs this interpolation on the four texels nearest to the current pixel. In our engine we default to bilinear filtering, but we can also switch to no filtering (nearest neighbor) if we need to. Below you can see the difference between bilinear filtering and no filtering.

With filtering:

Without:

As you can observe in the image without filtering, the edges on the numbers are more pronounced when compared to the image with filtering.
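To make the four-texel interpolation concrete, here is a minimal CPU-side sketch of what the hardware does when it filters bilinearly; the Texel struct and the edge clamping are illustrative, not engine code.

#include <algorithm>
#include <cmath>

struct Texel { float r, g, b, a; };

static Texel Lerp( const Texel& a, const Texel& b, float t )
{
	return { a.r + ( b.r - a.r ) * t, a.g + ( b.g - a.g ) * t,
		a.b + ( b.b - a.b ) * t, a.a + ( b.a - a.a ) * t };
}

// texels is a row-major width x height array; coordinates are clamped to the edge
static Texel SampleBilinear( const Texel* texels, int width, int height, float u, float v )
{
	const float x = u * width - 0.5f;
	const float y = v * height - 0.5f;
	const int x0 = std::max( 0, std::min( width - 2, (int)std::floor( x ) ) );
	const int y0 = std::max( 0, std::min( height - 2, (int)std::floor( y ) ) );
	const float tx = x - std::floor( x );
	const float ty = y - std::floor( y );
	// The four nearest texels
	const Texel c00 = texels[y0 * width + x0];
	const Texel c10 = texels[y0 * width + x0 + 1];
	const Texel c01 = texels[( y0 + 1 ) * width + x0];
	const Texel c11 = texels[( y0 + 1 ) * width + x0 + 1];
	// Interpolate along x on both rows, then along y between the results
	return Lerp( Lerp( c00, c10, tx ), Lerp( c01, c11, tx ), ty );
}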

Mipmaps:

Mipmaps are pre-computed copies of a texture at progressively lower resolutions. The height and width of each mipmap level are half those of the previous level, with the lowest level being a 1×1 image. Mipmaps increase rendering speed and help reduce aliasing, which occurs when a sampled image is reconstructed at the wrong resolution. Because there is a mip level close to the resolution at which the texture is actually being displayed, sampling happens at roughly the right rate and the aliasing artifacts are reduced.
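As a concrete example of how the chain bottoms out at 1×1, the number of levels in a full mip chain can be computed from the largest dimension (a small sketch, not engine code):

#include <algorithm>
#include <cmath>

// Number of mip levels, counting the full-resolution level and halving down to 1x1.
// For example, a 1024x512 texture has 11 levels: 1024, 512, 256, ..., 2, 1.
int GetMipLevelCount( int width, int height )
{
	return 1 + (int)std::floor( std::log2( (double)std::max( width, height ) ) );
}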

With Mipmaps:

Without:

Without mipmaps the texture is always sampled at its original resolution, so distant or minified surfaces are not filtered and smoothed. Below you can see all the mipmap levels for the car texture.

Adding more detail to geometry with the alpha channel:

We can add more apparent detail to geometry (cut-out shapes such as text or foliage, for example) by discarding fragments whose alpha value is below a cutoff. We perform this check in the fragment shader and throw the fragment away with the HLSL discard statement (the clip() intrinsic does the same thing).
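A minimal fragment-shader sketch of this alpha test; the texture, sampler, and input names follow the ones used in the texture post further down this page, and the 0.5 cutoff is just an example value.

// Alpha cutoff: reject mostly-transparent fragments
const float4 sampledColor = g_diffuseTexture.Sample( g_samplerState, i_textureData );
if ( sampledColor.a < 0.5 )
{
	// Throw this fragment away entirely so it writes neither color nor depth
	discard;
}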

Simple Animations:

Since UVs are just locations in the texture array, we can change how we access the array to achieve some simple animations. One simple animation is flowing water: the water is just a static texture, but we offset the sampling coordinates with time, which makes it look like it is flowing.

float2(cos(i_textureData_local.x + g_elapsedSecondCount_simulationTime), i_textureData_local.y);

The above line shows the code that achieves this effect. We can do this in either the vertex or the fragment shader, but I am doing it in the vertex shader since it runs far fewer times and is therefore less expensive. The texture also needs to tile (wrap) for this to work.

Another effect is having the image rotate around an axis.

Adding Support for Textures in Engine

Textures:

Textures are n-dimensional arrays which can contain any kind of information. Usually when talking about textures we mean 2D arrays containing color, even though there are other types such as height maps, normal maps, specular maps, environment maps, and others; some of these I will be adding to my engine in later stages. This post is about adding support for 2D textures containing color. Since textures are just arrays, they can be accessed with coordinates. These coordinates are called texture coordinates, TEXCOORDs, or UVs (which is how we will refer to them from now on). We get these UVs from mesh authoring programs such as Maya. I am adding these UVs to my mesh format and exporting them as binary. We also specify which texture we want to use for a material in our material file. If no texture file is specified, we use the default texture, which is just white.

return
{
	Material =
	{
		EffectLocation = "Effects/Effect1.Effectbinary",
		ConstantType = "float4",
		ConstantName = "Color",
		ConstantValue = { 0, 0, 0, 1 },
		TextureLocation = "Textures/water.jpg",
	},
}

We then add support for texture coordinates in our input layout in C++, in the vertexLayoutShader, and in any other shaders that use textures.

{
	auto& textureCoordinateElement = layoutDescription[2];
	textureCoordinateElement.SemanticName = "TEXCOORD";
	textureCoordinateElement.SemanticIndex = 0;	// (Semantics without modifying indices at the end can always use zero)
	textureCoordinateElement.Format = DXGI_FORMAT_R32G32_FLOAT;	// two 32-bit floats: u and v
	textureCoordinateElement.InputSlot = 0;
	textureCoordinateElement.AlignedByteOffset = offsetof( eae6320::Graphics::VertexFormats::sMesh, u );
	textureCoordinateElement.InputSlotClass = D3D11_INPUT_PER_VERTEX_DATA;
	textureCoordinateElement.InstanceDataStepRate = 0;	// (Must be zero for per-vertex data)
}

We also add a sampler state, which specifies how the texture is to be sampled. An application can get by with a single sampler state, or it can create several and bind them as needed. I have only one right now, which I bind at the start of the application.
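Here is a minimal Direct3D 11 sketch of creating and binding such a bilinear sampler state; the device and context variables stand in for however the engine exposes ID3D11Device and ID3D11DeviceContext, and the wrap addressing is an assumption that matches the tiling textures used on this page.

D3D11_SAMPLER_DESC samplerDescription = {};
samplerDescription.Filter = D3D11_FILTER_MIN_MAG_MIP_LINEAR;	// bilinear filtering with mipmapping
samplerDescription.AddressU = D3D11_TEXTURE_ADDRESS_WRAP;	// UVs outside [0,1] tile the texture
samplerDescription.AddressV = D3D11_TEXTURE_ADDRESS_WRAP;
samplerDescription.AddressW = D3D11_TEXTURE_ADDRESS_WRAP;
samplerDescription.ComparisonFunc = D3D11_COMPARISON_NEVER;
samplerDescription.MinLOD = 0.0f;
samplerDescription.MaxLOD = D3D11_FLOAT32_MAX;

ID3D11SamplerState* samplerState = nullptr;
if ( SUCCEEDED( device->CreateSamplerState( &samplerDescription, &samplerState ) ) )
{
	// Slot 0 corresponds to register( s0 ) in the shaders
	context->PSSetSamplers( 0, 1, &samplerState );
}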

Once we add support for these in the application code, we have to sample the texture in the shaders. We do that with the Sample method in HLSL, passing in the sampler state and the texture coordinates and getting back the color at that coordinate. I am performing the sampling in my fragment shader using the helper

SampleTexture2d(g_diffuseTexture, g_samplerState, i_textureData)

This expands to i_texture.Sample( i_samplerState, i_uv ).

I then multiply it with the vertex colors to get the final output color.

Meshes with textures on them.

Adding Diffuse lighting

What is Diffuse Light?
A diffuse reflection is the reflection of light by a surface such that the incident light is scattered at many angles instead of a single angle. Diffuse lighting occurs when the light is scattered by centers beneath the surface, as shown in the picture. A non-shiny surface has a very high diffuse value. The amount of diffuse light reflected by the surface is governed by Lambertian reflectance, given by Lambert's cosine law.


Most of the surfaces in the real world are diffuse by nature. The main exceptions include particularly shiny surfaces such as mirrors and metals.
Calculating Diffuse Light
I am using the following equation to calculate the diffuse light

(g_LightColor * (saturate(dot(normalize(g_LightRotation), normalize(i_normals)))) + ambientLight)

Where g_LightColor is the color of the directional light, g_LightRotation is the forward direction of the light. These two values are passed from the application in the per frame constant buffer. i_normals is the value of the normal of that fragment. The normal of a surface is defined as the vector that is perpendicular to the surface.
In the below video you can see that I am simulating a warning light beacon.

Pathfinding for AI

Assignment 2-Write Up

Pathfinding, along with movement, forms the most basic level of AI techniques and is a fundamental requirement in almost any game. Pathfinding moves a character from one place to another intelligently, along the most efficient path possible. A pathfinding algorithm generates the path from the current position to the final position based on conditions we will discuss below. The path generated by the pathfinding algorithm is then resolved by a path-follow algorithm, which translates the path into movement; this movement is finally delegated to either arrive or seek to reach the destination.

Before we go into the details of the pathfinding algorithms, let us look at the data structures we are using. The main one is the Node, which represents a point on the world graph. We also have edges which connect nodes; a node can have multiple outgoing edges, each with a weight that is used to decide which route is cheapest. The next data structure is the Node Record, which stores the cost to arrive at a particular node and the estimated total cost of a path through that node to the end node. This estimated total cost is the sum of the cost so far and a value called the heuristic. The final data structure is the Path, a collection of directed weighted edges, which is what the pathfinding algorithm returns.

Different algorithms use different heuristics and different criteria for calculating them. There are three broad cases: the heuristic can always underestimate the remaining cost (an admissible heuristic), which is what A* uses; it can be zero, in which case the algorithm is Dijkstra's; or it can overestimate the cost, which gives a greedy algorithm. Because of this, Dijkstra's is a special case of A*.

Now let us discuss both Dijkstra’s algorithm and A* in brief.

Dijkstra’s algorithm: Dijkstra works by spreading out from the start node along its connections. As it spreads out to more distant nodes, it keeps a record of the direction it came from, and once it reaches the end node, it will then generate the path by traversing the edges backward to the source node. Dijkstra’s algorithm only considers the cost so far to the current node and does not consider the future cost and since we terminate the algorithm as soon as we find the end node, the path given by Dijkstra might not always be the smallest. The main problem with Dijkstra is that it searches the entire graph for the shortest possible route. This is useful if we’re trying to find the shortest path to every possible node, but not for point-to-point pathfinding.

Dijkstra also has O(n²) execution complexity and O(n²) memory complexity in the worst case, since it has to go through all the nodes and all the connections of each node. Below is a video showing Dijkstra's pathfinding. The boid is moving from node 0 to node 6. The nodes are modeled after a real-life map, as shown below.

 

 

A* is designed to be a better version of Dijkstra with fewer comparisons, as it takes into account not only the cost so far to a node but also the estimated cost from that node to the end node.

A*: The basic algorithm for A* is very similar to Dijkstra's, but instead of always processing the open node with the lowest cost so far, we choose the node that is most likely to lead to the shortest overall path. That node is selected using a heuristic: if the heuristic is accurate, the algorithm is efficient; if the heuristic is terrible, it can perform even worse than Dijkstra. For each node we calculate the estimated total cost to the end node, which is the sum of the cost up to that node and the heuristic. The heuristic is generally some measure of the distance between the two nodes, and there are many heuristics commonly used with A*.

Some common heuristics for A*, which I have implemented in my code, are [1]:

1. Manhattan distance: Manhattan distance is the standard heuristic for square grids, which is what I am using. It is defined as the sum of the differences in x and y between the two nodes.

dx = abs(node.x - goal.x)
dy = abs(node.y - goal.y)
return (dx + dy)

2. Euclidean distance: Manhattan distance does not work well if we want paths that can move diagonally. To allow the path to go in any direction we use Euclidean distance, defined as

dx = abs(node.x - goal.x)
dy = abs(node.y - goal.y)
return sqrt(dx * dx + dy * dy)

3. Diagonal distance: This is particularly useful for diagonal movement but not for straight-line movement.

dx = abs(node.x - goal.x)
dy = abs(node.y - goal.y)
return (dx + dy) + min(dx, dy)

4. Zero: This simply returns zero, which essentially turns A* into Dijkstra's algorithm.

Below is a video showing the A* movement.

The video above and the video for Dijkstra were both generated on the same graph, with the same nodes and the same edge weights, but we can see the difference in how the path is calculated by Dijkstra compared to A*. A* is guaranteed to give the shortest path on any graph as long as the heuristic is admissible.

The performance of A* can be given as O(lm) for both memory usage and execution time, where l is the number of nodes whose estimated cost is less than that of the goal and m is the number of outgoing connections per node. Since A* takes the future cost into consideration and updates a node's connection when it finds a better one, A* is considerably faster than Dijkstra for the same graph, even though the big O notation looks similar for both.

Implementation Details:

1. AStar: The algorithm uses two priority queues, the open list and the closed list, which I have implemented as vectors. We use priority queues because we want the nodes in them sorted by estimated total cost. The open list contains all the nodes currently being considered or to be considered next; the closed list contains all the nodes that have already been visited. The algorithm starts by inserting the start node into the open list. Once inserted, we get that node's connections and iterate over them, looking at the end node of each connection and the cost to reach it.

We then check whether the node is in the closed list; if it is and the new route to it is cheaper, we remove it from the closed list and insert it into the open list. We also check whether the node is already in the open list; if it is and the new cost to reach it is lower than the existing cost, we update the connection to that node. Otherwise we insert it into the open list. Once all of a node's connections are processed, the current node is moved to the closed list. Since I am using vectors instead of custom queues, I sort the data with std::sort.

bool SortHelperFunction(NodeRecord* n1, NodeRecord* n2) { return n1->m_EstimatedTotalCost < n2->m_EstimatedTotalCost; }
std::sort(openList.begin(), openList.end(), SortHelperFunction);

I sort both the open and closed lists so that the nodes with the least cost are at the beginning and easy to access.

Once we find the end node, I walk back from it node by node and reverse the resulting vector. The output is a Path structure which contains a vector of DirectedWeightedEdges. A compact sketch of the whole loop is shown below.
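The following is a self-contained sketch of the open/closed-list A* described above. The struct layout, member names, and the Manhattan heuristic are illustrative; they are not the exact engine code.

#include <algorithm>
#include <cmath>
#include <vector>

struct Node;
struct DirectedWeightedEdge { Node* m_Source; Node* m_Sink; float m_Weight; };
struct Node { std::vector<DirectedWeightedEdge> m_Edges; float x, y; };
struct Path { std::vector<DirectedWeightedEdge> m_Edges; };

struct NodeRecord
{
	Node* m_Node;
	DirectedWeightedEdge m_Edge;	// the edge used to reach this node
	float m_CostSoFar;
	float m_EstimatedTotalCost;
};

static bool CompareByEstimatedTotalCost( const NodeRecord& n1, const NodeRecord& n2 )
{
	return n1.m_EstimatedTotalCost < n2.m_EstimatedTotalCost;
}

static float Heuristic( const Node* a, const Node* b )
{
	return std::abs( a->x - b->x ) + std::abs( a->y - b->y );	// Manhattan
}

Path FindPath( Node* start, Node* goal )
{
	std::vector<NodeRecord> open, closed;
	open.push_back( { start, { nullptr, nullptr, 0.0f }, 0.0f, Heuristic( start, goal ) } );

	while ( !open.empty() )
	{
		// Keep the record with the lowest estimated total cost at the front
		std::sort( open.begin(), open.end(), CompareByEstimatedTotalCost );
		NodeRecord current = open.front();
		open.erase( open.begin() );
		if ( current.m_Node == goal ) { closed.push_back( current ); break; }

		for ( const DirectedWeightedEdge& edge : current.m_Node->m_Edges )
		{
			const float costSoFar = current.m_CostSoFar + edge.m_Weight;
			auto sameNode = [&]( const NodeRecord& r ) { return r.m_Node == edge.m_Sink; };
			auto inClosed = std::find_if( closed.begin(), closed.end(), sameNode );
			auto inOpen = std::find_if( open.begin(), open.end(), sameNode );
			if ( inClosed != closed.end() )
			{
				if ( inClosed->m_CostSoFar <= costSoFar ) continue;	// existing route is better
				closed.erase( inClosed );	// cheaper route found: reopen the node
			}
			else if ( inOpen != open.end() )
			{
				if ( inOpen->m_CostSoFar <= costSoFar ) continue;
				open.erase( inOpen );	// will be re-inserted below with the better cost
			}
			open.push_back( { edge.m_Sink, edge, costSoFar, costSoFar + Heuristic( edge.m_Sink, goal ) } );
		}
		closed.push_back( current );
	}

	// Walk the stored edges back from the goal, then reverse to get start -> goal
	Path path;
	Node* node = goal;
	while ( node != start )
	{
		auto record = std::find_if( closed.begin(), closed.end(),
			[&]( const NodeRecord& r ) { return r.m_Node == node; } );
		if ( record == closed.end() ) break;	// no path was found
		path.m_Edges.push_back( record->m_Edge );
		node = record->m_Edge.m_Source;
	}
	std::reverse( path.m_Edges.begin(), path.m_Edges.end() );
	return path;
}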

2. Calculating Heuristics: I have an enum which specifies all the heuristics supported in my implementation. All the heuristics are implemented as static functions that take two nodes and return a float. In the AStar class, I have a GetHeuristic function which serves as a wrapper for calling the different heuristic functions.

float AStar::GetHeuristic(Node* i_StartNode, Node* i_EndNode)
{
	switch (m_Heuristic)
	{
	case Heuristic::Euclidean:
		return Heuristics::Euclidean(i_StartNode, i_EndNode);
	case Heuristic::Manhattan:
	case Heuristic::Default:
		return Heuristics::Manhattan(i_StartNode, i_EndNode);
	case Heuristic::Diagonal:
		return Heuristics::Diagonal(i_StartNode, i_EndNode);
	case Heuristic::Zero:
		return 0;
	}
	return 0;
}

enum class Heuristic
{
	Manhattan,
	Euclidean,
	Diagonal,
	Default,
	Zero
};

struct Heuristics
{
	static float Manhattan(Node* i_StartNode, Node* i_EndNode)
	{
		return abs(i_StartNode->m_Position.x - i_EndNode->m_Position.x) + abs(i_StartNode->m_Position.y - i_EndNode->m_Position.y);
	}
	static float Euclidean(Node* i_StartNode, Node* i_EndNode)
	{
		return i_StartNode->m_Position.distance(i_EndNode->m_Position);
	}
	static float Diagonal(Node* i_StartNode, Node* i_EndNode)
	{
		float dx = abs(i_StartNode->m_Position.x - i_EndNode->m_Position.x);
		float dy = abs(i_StartNode->m_Position.y - i_EndNode->m_Position.y);
		return (dx + dy) + std::min(dx, dy);
	}
};

3. Calculating the movement from the path returned by A*: The A* algorithm returns a Path struct as described above. The resulting path is then fed into a new movement algorithm called DynamicPathFollow.

a. DynamicPathFollow: DynamicPathFollow is a delegated behavior: it delegates the actual job of producing the dynamic output to DynamicArrive and DynamicLookWhereYouGo. I implement it by taking each point in the path and using DynamicArrive to move the boid to that position. I continuously check whether the character's position is inside the slow radius, and once it is, I switch the DynamicArrive target to the next point. This repeats until the final point is reached, as sketched below.
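A small sketch of that delegation; the DynamicPathFollow class shape, its m_* members, and the DynamicArrive/DynamicLookWhereYouGo interfaces are assumptions based on the description above, not the exact code.

DynamicSteeringOutput DynamicPathFollow::GetSteering(const Kinematic& i_Character)
{
	// Once the character is inside the slow radius of the current waypoint,
	// advance to the next waypoint (unless we are already heading to the last one)
	const float distanceToWaypoint = i_Character.Position.distance(m_Waypoints[m_CurrentIndex]);
	if (distanceToWaypoint < m_SlowRadius && m_CurrentIndex + 1 < m_Waypoints.size())
	{
		++m_CurrentIndex;
	}

	// Delegate: arrive at the waypoint, and look where you are going
	DynamicSteeringOutput steering = m_Arrive.GetSteering(i_Character, m_Waypoints[m_CurrentIndex]);
	steering.AngularAcceleration = m_LookWhereYouGo.GetSteering(i_Character).AngularAcceleration;
	return steering;
}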

4. Creating a grid instead of a graph: As discussed above, I created a graph based on a real-world location. Since this graph only had 11 nodes, it was easy to create by hand, but bigger graphs would be impractical to author that way. So I implemented a grid class that reads an image and translates it into a grid using a technique called tiling: we divide the entire space into tiles, with the center of each tile acting as a node. When pathfinding on such a grid, instead of checking stored connections we get a node's neighbors and use a constant cost between them. When a node record needs to be inserted into the open list, we create a new directed weighted edge between the current node and the neighboring node, and when building the path we use this edge to find the source and sink.

The following is the image I am using for grid generation; the code accepts any image and converts it into a grid. To work with grids, I also added a Boolean on each node marking whether the boid can reach it. This is used for collision avoidance.

There are two main ways of doing collision avoidance. One is to actively do collision detection every frame between the boid and all the objects in the scene; for a graph as big as mine this uses considerable resources. The second way is to build the collision information into the nodes when the path is created, which is what I am doing. Even though this method is resource-intensive while the path is being generated, once the path exists it only requires the resources needed for path following.

The finer the tiles, the better the collision detection. I am using tiles of 3×3 pixels; I experimented with other sizes such as 2×2, 4×4, and 8×8. The finer grids provide great collision detection but are slow to generate a path, and the boid moves erratically since there are many more nodes in the path.

The one that I have selected has a good balance between speed and performance.

Analysis:

A* and Dijkstra are two of the most famous pathfinding algorithms in use. Even though Dijkstra is rarely used in production games, it gives us a better look at how pathfinding works and how it relates to A*. As shown above, even for the same weights and nodes, A* performs a lot better than Dijkstra, and I think that is one of the reasons it is so widely used in games. I also found that Dijkstra used a couple more MB of memory than A* when creating the path for the grid, and it took considerably longer to find the path on huge graphs; sometimes the difference between A* and Dijkstra was more than a few minutes. This difference is not very visible on the smaller graph, where A* is only faster by a couple of milliseconds. I think the reason is that Dijkstra goes through every node, and there are very many of them in my grid: I had about 90,000 nodes in the scene with a 3×3 tile map. This also shows that even though Dijkstra and A* have similar theoretical big O values for computational complexity, the practical performance is totally different.

Download:

You can download the sample here: Download

Controls on Keyboard

1 – Dijkstra

2 – A* with Manhattan

3 – A* with Grid and click to move.

References:

[1] A. Patel, Heuristics, http://theory.stanford.edu/~amitp/GameProgramming/Heuristics.html [Website]. Available: theory.stanford.edu

Creating shaders based on various object transforms

Before we start discussing the topic, here is a small background about shaders and how data is sent from C++ to GPU.

Shader: Shaders are special programs that run on the GPU to create custom effects which are hard or impossible to achieve on the CPU side. There are two main types: the vertex shader and the fragment (pixel) shader. As their names suggest, the vertex shader runs on each vertex of a triangle and the fragment shader runs on each fragment covered by that triangle.

The main output of the vertex shader is the vertex position in projected space. We can also output any other data we like, color being the most common. All vertex shader outputs are then passed on to the fragment shader, interpolated between the vertices.

Fragment shaders run for each fragment inside the triangle and output a color, which is then displayed at the location computed by the vertex shader. There are many other types of shaders, such as geometry shaders, tessellation shaders, and mesh shaders, but we will only use vertex and fragment shaders for now.

When we send data from C++, the primary information we submit is the vertex data. Apart from this, we can also send data using constant buffers, which contain data that is constant over a frame or a draw call. In our engine we have two: the per-frame constant buffer and the per-draw-call constant buffer, which look like the following.

struct sPerFrame
{
	Math::cMatrix_transformation g_transform_worldToCamera;
	Math::cMatrix_transformation g_transform_cameraToProjected;

	Math::sVector g_CameraPositionInWorld;
	float padding0;

	float g_elapsedSecondCount_systemTime = 0.0f;
	float g_elapsedSecondCount_simulationTime = 0.0f;
	float padding1[2];	// For float4 alignment
};

struct sPerDrawCall
{
	Math::cMatrix_transformation g_transform_localToWorld;
	Math::cMatrix_transformation g_transform_localToProjected;
};

To use the above constant buffers, we need to declare them in shaders as shown below. Shaders for Direct3D are written in HLSL while those for OpenGL are written in GLSL. Since we will be working only with Direct3D, we will use HLSL.

cbuffer g_constantBuffer_perFrame : register( b0 )
{
	float4x4 g_transform_worldToCamera;
	float4x4 g_transform_cameraToProjected;

	float3 g_CameraPositionInWorld;
	float g_padding0;

	float g_elapsedSecondCount_systemTime;
	float g_elapsedSecondCount_simulationTime;
	// For float4 alignment
	float2 g_padding1;
};

cbuffer g_constantBuffer_perDrawCall : register( b2 )
{
	float4x4 g_transform_localToWorld;
	float4x4 g_transform_localToProjected;
};

We use the above matrices in our shaders to create effects based on the position of the object relative to the world or the camera. Since we output more data than just the position from the vertex shader, I created a struct which contains all the data I pass from the vertex shader so it can be easily accessed in the fragment shader. The struct can be modified as necessary to include additional data.

struct VS_OUTPUT
{
	float4 o_vertexPosition_projected : SV_POSITION;
	float4 o_vertexPosition_local : TEXCOORD1;
	float4 o_vertexColor_projected : COLOR;
	float4 o_vertexColor_local : TEXCOORD2;
};
1. Creating a shader that is independent of its position in the world.

To create a shader that has effects which are independent of world position, we need to use the local position of the fragment. This is what I am doing in my fragment shader

o_color = float4(
	floor( sin( i_VSInput.o_vertexPosition_local.x ) / cos( i_VSInput.o_vertexPosition_local.x ) ),
	floor( sin( i_VSInput.o_vertexPosition_local.y ) / cos( i_VSInput.o_vertexPosition_local.y ) ),
	floor( sin( i_VSInput.o_vertexPosition_local.z ) / cos( i_VSInput.o_vertexPosition_local.z ) ),
	1.0 ) * i_VSInput.o_vertexPosition_local;

This produces the following output:

2. Creating an effect that the object can move through: The fragment shader code is similar to the above, except that instead of the local vertex position we use the world position of the vertex.

3. Creating a grow and shrink effect: Until now we were only modifying the fragment shader, but to create a grow and shrink effect we need to change the positions of the vertices. We do it by creating a scaling matrix, modifying its scale value based on the time, and multiplying it with the local position. The rest of the transformations remain the same.

float s = ( sin( g_elapsedSecondCount_simulationTime ) + 0.5 ) + 1;
float4x4 scale =
{
	s, 0, 0, 0,
	0, s, 0, 0,
	0, 0, s, 0,
	0, 0, 0, 1,
};
// Transform the local vertex into world space
float4 vertexPosition_world;
{
	float4 vertexPosition_local = float4( i_vertexPosition_local, 1.0 );
	vertexPosition_local = mul( scale, vertexPosition_local );
	vertexPosition_world = mul( g_transform_localToWorld, vertexPosition_local );
}

Even though this method works, it might not be the most optimized. So, instead of matrix multiplication, we can use scalar multiplication to get the same results.

float s = ( sin( g_elapsedSecondCount_simulationTime ) + 0.5 ) + 1;
// Transform the local vertex into world space, scaling the local position first
float4 vertexPosition_world;
{
	float3 scaledLocalPosition = i_vertexPosition_local * s;
	float4 vertexPosition_local = float4( scaledLocalPosition, 1.0 );
	vertexPosition_world = mul( g_transform_localToWorld, vertexPosition_local );
}

4. Changing the effect on an object based on its proximity to the camera: To find the distance between the object and the camera, we take the length of the vector between the camera position and the world position of the vertex. I then lerp between red and the current vertex color based on a clamped value of the distance relative to the far plane of the camera.

const float4 color = {1,0,0,1};
const float distance = length((g_CameraPositionInWorld - (vertexPosition_world).xyz));
output.o_vertexColor_projected = lerp(color, i_vertexColor_local, saturate(distance/100));

Real Time Rendering – Combining transforms.

For every object being rendered in the scene we currently have three transformations: local space to world space, world space to camera space, and finally camera space to projected space (the 2D plane of the screen). All of these transformations are performed in the vertex shader. As you know, the vertex shader is called for every vertex of the object, and even though modern GPUs are massively parallel, applying these matrices is still computationally significant. Hence, instead of applying them separately in the vertex shader, we concatenate them in C++ and pass one single local-to-projected transform to the shader in a constant buffer, where the vertex shader can apply it directly. A constant buffer is one which does not change over a draw call. This reduces the number of instructions the vertex shaders compile to.


 

As you can see from the above slideshow, the number of shader instructions before and after passing the local-to-projected transform has changed considerably. If we still need other transforms, such as world-to-camera, we can pass them in the per-frame constant buffer since they do not change throughout a frame.

We calculate the local-to-projected transform by multiplying the local-to-world, world-to-camera, and camera-to-projected matrices together. The order in which we concatenate the matrices does not matter as long as the camera-to-projected transform (or any product containing it) stays on the left-hand side of the operation, because the camera-to-projected transform is not an affine transformation.

I do it in the following way because I want to save the local-to-camera transform in my per-draw-call constant buffer so the specialty shaders from assignment 2 keep working; it is hard to recover the intermediate transforms from the local-to-projected transform alone, since that would mean converting 2D projected space back into 3D space.

local to camera = world to camera * local to world;

local to projected = camera to projected * local to camera;
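A minimal C++ sketch of that concatenation, assuming Math::cMatrix_transformation provides an operator* that concatenates transforms in the order written above; the variable names are illustrative.

const Math::cMatrix_transformation transform_localToCamera =
	transform_worldToCamera * transform_localToWorld;
const Math::cMatrix_transformation transform_localToProjected =
	transform_cameraToProjected * transform_localToCamera;
// transform_localToProjected (and, for the specialty shaders, transform_localToCamera)
// is then copied into the per-draw-call constant buffer before the draw call is submitted.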

 

Optimizing the render pipeline

The render pipeline refers to the order in which we submit data to render, bind the shaders, and finally draw the triangles before swapping the buffers. The more optimized the pipeline, the faster we are able to send data to the GPU, which results in higher frame rates and more responsive games. Currently the engine binds an effect every time it draws an object (which is every draw call). This might not make any impact for small games like mine, but for any moderately sized game it starts to become a bottleneck.

So, the first thing we need to do is to figure out a way of not binding effects for every draw call i.e., we need to bind an effect once and then draw all the meshes which are using that effect. The way we can accomplish this is by using render commands. Render commands are essentially unsigned integers which represent a rendering operation. There can be many types of render commands such as one for drawing, one for transparency etc. The render command that we use in this assignment is for drawing an object.

I am representing a render command as an unsigned 64-bit integer, mainly because of the number of bits that type provides; as you will see below, we need as many bits as possible to store everything a command has to carry. When packing data into a render command, the order is important: the data with the highest sorting priority goes in the most significant bits and the lowest priority in the least significant bits, so that when we sort, commands that share the same high-priority data end up next to each other.

Since we only need to bind an effect once, it has the highest priority and hence goes in the most significant bits. We need to make sure we have enough bits to store all the possible effects we could have in the game; I am using 7 bits for effects, which gives me up to 128. The data I store for an effect is the index at which the effect is stored by its manager, and I store meshes in the render command the same way. Since meshes are not that important for the sorting order, they go in the last 7 bits of the render command.

Also, when drawing objects, the objects that are closer to the camera should be drawn before objects that are farther away. So we calculate the z-depth from the camera to each object, scale it to the number of bits we are allocating, and store that value in the render command. This has the second highest priority after effects. So my final render command structure looks like the following.
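Here is a hedged sketch of how such a command could be packed; the bit counts follow the description above (7 bits of effect index in the most significant bits, a quantized camera depth in the middle, 7 bits of mesh index at the bottom), but the helper itself is illustrative rather than the engine's actual code.

#include <algorithm>
#include <cstdint>

uint64_t MakeRenderCommand( uint64_t i_effectIndex, float i_normalizedDepth, uint64_t i_meshIndex )
{
	// Quantize a depth in [0,1] into the bits between the effect and mesh indices
	const uint64_t depthBits = 64 - 7 - 7;
	const uint64_t maxDepth = ( uint64_t( 1 ) << depthBits ) - 1;
	const double clampedDepth = std::min( std::max( (double)i_normalizedDepth, 0.0 ), 1.0 );
	const uint64_t quantizedDepth = (uint64_t)( clampedDepth * (double)maxDepth );

	return ( ( i_effectIndex & 0x7F ) << ( 64 - 7 ) )	// effect index: highest sorting priority
		| ( quantizedDepth << 7 )	// depth from the camera: second priority
		| ( i_meshIndex & 0x7F );	// mesh index: lowest priority
}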

After storing a render command for each mesh, I use std::sort to sort everything in descending order. For each command I then check whether its effect is already bound; if not, I bind it first, and then draw the mesh.

Below you can see two videos captured from various angles to show how all meshes of one effect are drawn first based on the camera distance and then the others.

Below are the screenshots from graphics analyzer showing how one effect is bound first and then all the meshes are drawn before another effect is bound.

The orange rectangles highlight when effects are bound and green shows when meshes are drawn.

 

Optional Challenges:

The graphics analyzer highlights calls which set states that were already set in the previous frame and do not need to be set again, as shown below.

 ©John-Paul Ownby.

This is happening because I am setting the vertex buffers on every draw call to DirectX. To optimize this, I now set these buffers only when I switch meshes. The output looks like this.

 

Basic AI movement algorithms

Movement forms the most basic level of AI techniques and is a fundamental requirement in almost any game. Each character in the game has a position and some other properties that help control its movement. A movement algorithm uses these properties to find the next position to move the character to. All movement algorithms take data about the current state of the character and the world and produce a movement request which is then fed back into the character. Not all movement algorithms require complex input, and not all provide a complex output.

There are two basic types of movement algorithms

Kinematic: Kinematic movement algorithms have a single output, namely the direction to move. They normally provide only a velocity and direction, not accelerations. The characters tend to move in the required direction at full speed, and this results in movement that is not quite realistic.

Dynamic: Dynamic movement algorithms are also called steering algorithms, as they provide the accelerations necessary to move the character to a required position. Craig Reynolds first described steering behaviors for the movement algorithms he developed. Dynamic algorithms add an extra layer of computational complexity but result in more realistic movement.

To develop our algorithms, we only consider the character in 2D space, with the 3rd axis representing gravity.

I wrote all the algorithms in C++ with OpenFrameworks. I am depicting the input and outputs as classes and structs in the following manner. The classes and structures as shown below have been evolving as I write more algorithms.

struct DynamicSteeringOutput
{
	DynamicSteeringOutput() {}
	~DynamicSteeringOutput() {}
	DynamicSteeringOutput(ofVec2f linear, float angular) : LinearAcceleration(linear), AngularAcceleration(angular) {}

	ofVec2f LinearAcceleration = ofVec2f(0, 0);
	float AngularAcceleration = 0;

	DynamicSteeringOutput operator+ (DynamicSteeringOutput& i_Dynamic);
	DynamicSteeringOutput operator- (DynamicSteeringOutput& i_Dynamic);
	template <typename T>
	DynamicSteeringOutput operator* (T i_Other);
};

struct KinematicSteeringOutput
{
	ofVec2f Velocity = ofVec2f(0, 0);
	float Rotation = 0;
};

class Kinematic
{
public:
	ofVec2f Position;
	float Orientation = 0;
	ofVec2f Velocity;
	float Rotation = 0;

	void Update(KinematicSteeringOutput i_SteeringInput);
	void Update(DynamicSteeringOutput i_SteeringInput);

private:
	float mMaxVelocity;
};

The Kinematic class is the main class which holds the Position, Orientation, Velocity, and Rotation of an object. Its Update overloads take the respective steering outputs and update those variables. I also have a boid class which draws a circle and a triangle on screen, checks whether the boid is out of bounds, and wraps it around if it is. It also leaves breadcrumbs on the screen so that we can follow the boid around. All algorithms are implemented in their own classes and expose a common function named GetSteering, which takes different inputs depending on the algorithm.

Implementing the algorithms:

There are quite a few algorithms that I've written to create different behaviors. Kinematic algorithms update both the velocity and the rotation, while dynamic algorithms update only one of linear or angular acceleration. So, to create complex dynamic behaviors, we sum the outputs of different simple algorithms.

The first algorithm I will discuss is 'Seek.' Seek is one of the basic movement algorithms; it essentially moves the character toward a specific target. We will discuss both the kinematic and dynamic versions. First, kinematic seek: it takes the positions of the character and the target, calculates the direction between them, and moves the character in that direction at a maximum velocity that we provide. We can also directly set the orientation, if needed, using the arctangent of the velocity. Since the velocity is never set to zero, the character keeps moving, so this algorithm is suited to chasing a moving target.

Dynamic Seek: Dynamic seek keeps everything the same as kinematic seek but updates acceleration instead of velocity. Since a constant acceleration keeps increasing the velocity, we clamp the maximum value the velocity can reach. A sketch of the steering calculation is shown below.
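This is a minimal sketch of dynamic seek using the structs shown earlier; the DynamicSeek class itself and the m_MaxAcceleration member are assumed names, not the exact code.

DynamicSteeringOutput DynamicSeek::GetSteering(const Kinematic& i_Character, const ofVec2f& i_TargetPosition)
{
	DynamicSteeringOutput steering;
	// Accelerate at full strength along the direction from the character to the target
	ofVec2f direction = i_TargetPosition - i_Character.Position;
	direction.normalize();
	steering.LinearAcceleration = direction * m_MaxAcceleration;
	steering.AngularAcceleration = 0.0f;
	return steering;
}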

Kinematic and Dynamic Flee: Flee is similar to seek, but instead of applying acceleration or velocity in the direction of target, we apply them in the opposite direction.

Kinematic Arrive: Since kinematic seek never stops once it reaches the destination, we need an algorithm that stops the character at the target, which is arrive. The arrive algorithms use a target radius surrounding the target and stop the character as soon as it enters that radius. If the radius is small, the character overshoots and doubles back repeatedly, which results in a wiggling motion. Hence I used a large radius such as 100, which eliminates the wiggle but results in the character stopping short of the target.

Dynamic Arrive: Like kinematic arrive, dynamic arrive has a target radius, but it also has a larger slow radius inside which the character decelerates. The rate of deceleration is proportional to how far the character is between the slow and target radii, and once the character reaches the target radius we stop the movement. In my case I used a very large slow radius with a very small target radius, resulting in a slow, smooth stop at the target. Below is a sketch of the steering calculation, followed by a video in which the character can be seen moving to the target, which is the mouse click position.
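A sketch of dynamic arrive; the class name and the m_TargetRadius, m_SlowRadius, m_MaxSpeed, m_MaxAcceleration, and m_TimeToTarget members are assumed tunables, not the exact code.

DynamicSteeringOutput DynamicArrive::GetSteering(const Kinematic& i_Character, const ofVec2f& i_TargetPosition)
{
	DynamicSteeringOutput steering;
	ofVec2f direction = i_TargetPosition - i_Character.Position;
	const float distance = direction.length();

	// Full speed outside the slow radius, scaled down inside it, zero inside the target radius
	float targetSpeed = 0.0f;
	if (distance > m_TargetRadius)
	{
		targetSpeed = (distance > m_SlowRadius) ? m_MaxSpeed : m_MaxSpeed * distance / m_SlowRadius;
	}

	// Accelerate toward the target velocity, clamped to the maximum acceleration
	ofVec2f targetVelocity = direction.getNormalized() * targetSpeed;
	steering.LinearAcceleration = (targetVelocity - i_Character.Velocity) / m_TimeToTarget;
	if (steering.LinearAcceleration.length() > m_MaxAcceleration)
	{
		steering.LinearAcceleration = steering.LinearAcceleration.getNormalized() * m_MaxAcceleration;
	}
	return steering;
}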

Dynamic Look Where You Are Going (LWYG): As discussed above, all basic dynamic algorithms generate either a linear or an angular acceleration. To rotate the character toward its target as seen above, we use an LWYG algorithm along with dynamic align, which we will discuss next. LWYG takes the character's movement and, from its velocity, calculates the orientation the character should align itself to.

Dynamic Align: Dynamic align is the main algorithm which creates the angular acceleration required to rotate. It calculates the rotation between the character's orientation and a target orientation, which can be supplied by algorithms other than LWYG. It then maps that rotation to the range -180 to 180 degrees, calculates the required angular acceleration, clamps it to the maximum, and returns it. As with dynamic arrive, I used a large slow angle and a small target angle, and set the maximum rotation speed based on the algorithm doing the delegating, to achieve the results seen in the above video.

Kinematic wander: The wander algorithm generates random movement. The kinematic version does this by changing the current orientation and moving the character along the new orientation at maximum speed. To change the orientation we use a random binomial, which generates a random number between -1 and 1 but produces values nearer to 0 more often. This keeps the change in orientation random yet small, as sketched below. The same technique is used in dynamic wander too.
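A sketch of the random binomial and the kinematic wander step; ofRandom is openFrameworks, while the class name and the m_MaxRotation and m_MaxSpeed members are assumed.

float RandomBinomial()
{
	// Values near 0 are the most likely; the range is (-1, 1)
	return ofRandom(0.0f, 1.0f) - ofRandom(0.0f, 1.0f);
}

KinematicSteeringOutput KinematicWander::GetSteering(const Kinematic& i_Character)
{
	KinematicSteeringOutput steering;
	// Nudge the orientation by a small random amount and move at full speed along it
	const float newOrientation = i_Character.Orientation + RandomBinomial() * m_MaxRotation;
	steering.Velocity = ofVec2f(cosf(newOrientation), sinf(newOrientation)) * m_MaxSpeed;
	steering.Rotation = newOrientation - i_Character.Orientation;
	return steering;
}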

[1]

Dynamic Wander: Dynamic wander algorithm is part of a group called delegated behaviors of which LWYG is also a part. All delegated behaviors behave in almost the same way. They generate a target and either position or acceleration and then delegate the steering to other basic dynamic behaviors — for example, LWYG delegates to align. In kinematic wander, we randomly selected an orientation, which can be thought of as selecting a point on a circle at some radius. In dynamic wander, we create the target on a circle at a fixed distance in front of the character.

[1]

I am using arrive to move to the point and LWYG to rotate toward it. We can also implement the same thing using dynamic face and then applying maximum acceleration in that direction. I implemented both methods and found that arrive + LWYG gave me the best results, as seen in the video below.

Blending: Blending is the process of linearly combining different behaviors. The usual way to blend is to assign a weight to each algorithm, multiply each algorithm's output by its weight, and sum the results to get the final output (a sketch follows below). The most common blended behavior is flocking, developed by Craig Reynolds and used in many games and movies. Flocking mimics the movement of a flock of birds and has three main component behaviors: separation, velocity and alignment matching, and moving toward the center of mass. In our case we assign a mass of 1 to every boid, so the center of mass is simply the center of all their positions. We will discuss the individual algorithms below.
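A sketch of the weighted blend; the BehaviorAndWeight struct and the common DynamicSteering base class with a single-argument GetSteering are assumptions for illustration (in my code the GetSteering signatures vary per algorithm).

#include <vector>

struct DynamicSteering
{
	virtual ~DynamicSteering() {}
	virtual DynamicSteeringOutput GetSteering(const Kinematic& i_Character) = 0;
};

struct BehaviorAndWeight
{
	DynamicSteering* m_Behavior;
	float m_Weight;
};

DynamicSteeringOutput GetBlendedSteering(const std::vector<BehaviorAndWeight>& i_Behaviors, const Kinematic& i_Character)
{
	DynamicSteeringOutput total;
	for (const BehaviorAndWeight& entry : i_Behaviors)
	{
		// Scale each behavior's output by its weight and accumulate
		DynamicSteeringOutput output = entry.m_Behavior->GetSteering(i_Character);
		total.LinearAcceleration += output.LinearAcceleration * entry.m_Weight;
		total.AngularAcceleration += output.AngularAcceleration * entry.m_Weight;
	}
	return total;
}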

Dynamic Separation: In its basic form, dynamic separation checks the distance from the boid to every other boid in the scene, and if another boid is too close, applies an acceleration in the opposite direction so the boids move apart. The steering applied to a boid is the sum of all the individual forces from the surrounding boids, and can be zero when it is surrounded by an even number of equidistant boids. To make the forces more realistic, we use the inverse square law, applying a force inversely proportional to the square of the distance between the two boids.

Dynamic Velocity Match: This algorithm takes the difference between the velocities of the character and the target, clamps it to a maximum, and sets the result as the character's linear acceleration.

I've implemented moving to the center of mass using the Dynamic Arrive and LWYG algorithms.

Flocking can be done either as a leaderless group or as a group following a leader; in our assignment we do the latter. For the calculation of the center of mass, I first average the positions of all the boids except the leader and then average that result with the leader's position. This gives the leader's position more weight and lets the followers trail a little behind the leader. All the boids, including the leader, avoid each other. The leader is click-to-move, and even when it reaches the required position you can see it still moving around, trying to avoid the other boids.

You can see the flocking in the above video. I've also implemented changing the leader: if you watch carefully, once a new leader is selected, the old leader joins the pack and the pack moves away from the new leader. Increasing the number of followers can make the screen a bit crowded since all the boids repel each other. I experimented with 5–10 followers, and with 10 it became harder for the leader to reach the designated position when we click on the screen.

I also added the ability to add a second leader who moves randomly. The followers then follow whichever leader is closest to them. You can enable and switch between secondary leaders by pressing 'k' and disable the secondary leader with 'l'.

Analysis:

Most of the algorithms that I've implemented are dynamic. I also implemented kinematic versions of seek, arrive, and wander, but the dynamic algorithms are much better, as they provide much smoother motion. Most of my time was spent finding the right values for the maximum speeds, accelerations, and other parameters. Since even a small change in acceleration or speed drastically changed the output, I had to search for values I was satisfied with. I think I could still tweak the values for even more precise movement, but as my professor said in class, we are not trying to find the perfect movement; if it is good enough to pass without close inspection, then we are in a good place.

For flocking, I assigned weights of 2, 1, 3, and 2 to separation, velocity match, arrive, and LWYG respectively. I think that for a good flock, even if some boids occasionally fall behind the main group, they should mainly avoid colliding with others and try to move toward the center. Changing the weights produces different behaviors: with fractional weights for separation, some boids collided with each other while trying to reach the center. Currently the leader and the followers move at different speeds and use different avoidance weights; since the followers always follow the leader, I halved the leader's avoidance weight and increased its maximum speed so it reaches the destination more reliably.

Adding a secondary leader to flocking changes a lot, as the boids have to continuously check which leader they are closest to while still staying close to the overall center of mass.

Download:

You can download the sample here: Download. All code is open source and is available at my GitHub.

Controls on Keyboard

press 1 – Basic movement

press 2 – Move to mouse click

press 3 – Wander

press 4 – Flocking

press N – When flocking, to change the leader.

press K – When flocking, to enable secondary leader and change secondary leader.

press L –  When flocking, to disable secondary leader and change secondary leader.

When flocking or seek to mouse, use the mouse left button to set the position.

References:

[1] I. Millington and J. Funge, Artificial Intelligence for Games, 2nd ed. CRC Press, 2009. [E-book] Available: Safari e-book.

Creating a bug report system for our game

It is vital for a game to be as bug-free as possible so users get the best experience. It is also vital for bugs discovered in the game to be reported to the developers, so that we can act on them as soon as possible and ship an update.

Keeping this in mind, I came up with the idea of a simple, user-facing bug report system for our game, Split, so that people who are playing can report bugs. These bugs are then categorized and assigned to developers.

It works in the following way.

First, when the user presses the bug report button (which is mapped to the D-Pad on the controller), we take a screenshot. I then populate a widget with that screenshot, along with a text box for the user to enter the bug details. Once the user clicks submit, I save the screenshot as a PNG and the text as a text file. To store them on disk, the current method I'm employing is to create a separate folder for each bug report and save the corresponding PNG and text file in it.

We are updating this to use SQL Server, wherein we will save the PNG as a BLOB and the text in a database table. I'm also planning to create a WPF application in C# which will work as a gallery of these reports, so our producers can look through them and assign them to the right people. I am also planning to improve the system by storing not only screenshots but also video of the previous 5 seconds of gameplay.

We are also planning to extend the bug reporting system into a complete telemetry system, wherein we will log various parts of the player's progress to serve them better.

How I'm doing it in Unreal:

Unreal has built-in tools to take a screenshot of the game: a screenshot function and a high-res screenshot function. I am using the screenshot function in my implementation.

First I bind a function to handle the delegate.

GetWorld()->GetGameViewport()->OnScreenshotCaptured().AddUObject(this, &ALevelManager::BugReportHelper);

This is the function declaration that I use.

void BugReportHelper(int32 sizeX, int32 sizeY, const TArray<FColor>& i_Bitmap);

Then we use the RequestScreenshot function to request the screenshot:

FScreenshotRequest::RequestScreenshot(false);

After the screenshot is taken, I create a texture with the bitmap data. I was using FImageUtils::CreateTexture2D but found out that this function does not work with packaged builds for console. So below is the code I am using to create the texture.

UTexture2D* tempTexture = UTexture2D::CreateTransient(sizeX, sizeY);
if (tempTexture)
{
	tempTexture->SRGB = 0;
	// Copy the raw screenshot colors into the texture's top mip, then update the GPU resource
	FTexture2DMipMap& Mip = tempTexture->PlatformData->Mips[0];
	void* Data = Mip.BulkData.Lock(LOCK_READ_WRITE);
	int32 stride = (int32)(sizeof(uint8) * 4);	// 4 bytes per FColor
	FMemory::Memcpy(Data, i_Bitmap.GetData(), tempTexture->PlatformData->SizeX * tempTexture->PlatformData->SizeY * stride);
	Mip.BulkData.Unlock();
	tempTexture->UpdateResource();
}

I display this texture on an Image in the UI. I created a widget that extends UUserWidget and use it as the base class for the blueprint version of the widget. After the user enters the bug report and hits submit, I call a function in the widget which stores the data as shown below.

 FHighResScreenshotConfig& hConfig = GetHighResScreenshotConfig();
 hConfig.SaveImage(imageFileName, bitMapData, FIntPoint(X, Y), &resultPath); 

where bitMapData is the data from the screenshot and X, Y are its dimensions. If the dimensions passed in do not match the dimensions of the bitmap data, the image will not be saved.

Game Engineering II – Final Update

I discussed that I’m making a racing game for my final project for the class here. While working on the game, I changed the direction I was going and instead of a normal racing game, I made a drag race game.

You can download the game from the below link.

In this game, you race against either an AI or a second player (if a second controller is connected) and try to win as many races as possible. You have a choice of 3 cars with different acceleration values. I initially wanted to have an unlock system for the cars but decided against it. You use a controller to control the car; the controls are the common ones used in racing games, namely:

  1. Accelerate – Right trigger / W
  2. Brake – Left trigger / S
  3. Change camera – Right shoulder button.
  4. Change cars – D-Pad Up and down / M , N
  5. Start / Restart race – Start button / R

My experience building the game: I made a lot of changes to the design of the game, but in the end made a simple drag race simulator. There were a few features I couldn't get in, the most important one being split-screen multiplayer, since my game already handles multiple inputs.

Using my own engine component: I made a controller input component as described here. It was very easy to use and I did not have to change anything in the component. 

Using Shantanu's Audio Engine Component: Shantanu Pandey's audio engine component was also pretty easy to use, with easy-to-access interfaces for playing an audio file. But I came across some bugs, which I hope are fixed in a future update.

What I learned from this class:
The most important thing I think I learnt is the importance of data-oriented design, which is a bit different from the traditional object-oriented design we generally see in most programs.

I also learnt that in game programming it is often better to sacrifice safety for speed, as we want the game to deliver the best possible experience. This includes assuming the way data is laid out when retrieving it, without checking its integrity, as I've done for meshes and effects. It also means that sometimes you can spend extra memory to gain cache coherency.

The importance of binary files: As I discussed here, binary files have quite a few advantages which are immensely useful. Not having worked with binary files before, I found them quite fascinating, and the class taught me how to maximize their utility.

I also realized the importance of creating easy-to-use interfaces for the functions I wrote while working on my engine component. I saw a few of my classmates' components that were extremely well written but did not provide an easy-to-use interface, which made them harder to adopt. When people said it took no effort to integrate my component, I felt good about creating something that people wanted to use.

There are many more things that I learned from this class apart from those I posted above and I am looking forward to the graphics class next semester 🙂