Adding Support for Textures in Engine


Textures are n-dimensional arrays that can contain any kind of information. Usually, when we talk about textures, we mean 2D arrays containing color, even though there are other types of textures such as height maps, normal maps, specular maps, environment maps and others. I will be adding some of these to my engine in later stages. This post is about adding support for 2D textures containing color. Since textures are just arrays, they can be accessed like an array by using coordinates. These coordinates are called texture coordinates, TEXCOORDs or UVs (which is what we will call them from now on). We get these UVs from mesh authoring programs such as Maya. I am adding these UVs to my mesh format and then exporting them as binary. We also specify which texture we want to use for a material in our material file. If no texture file is specified, we use the default texture, which is just white.

Material =
{
	EffectLocation = "Effects/Effect1.Effectbinary",
	ConstantType = "float4",
	ConstantName = "Color",
	ConstantValue = {0,0,0,1},
	TextureLocation = "Textures/water.jpg",
}

We then add the texture coordinates to our input layout in C++, to the vertexLayoutShader and to any other shaders that use textures.

auto& textureCoordinateElement = layoutDescription[2];
textureCoordinateElement.SemanticName = "TEXCOORD";
textureCoordinateElement.SemanticIndex = 0; // (Semantics without modifying indices at the end can always use zero)
textureCoordinateElement.Format = DXGI_FORMAT_R32G32_FLOAT; // Two 32-bit floats: u and v
textureCoordinateElement.InputSlot = 0;
textureCoordinateElement.AlignedByteOffset = offsetof(eae6320::Graphics::VertexFormats::sMesh, u);
textureCoordinateElement.InputSlotClass = D3D11_INPUT_PER_VERTEX_DATA;
textureCoordinateElement.InstanceDataStepRate = 0; // (Must be zero for per-vertex data)

We also add a sampler state, which specifies how the texture is to be sampled. There can be a single sampler state for the entire application, or there can be several that are bound when needed. I have only one right now, which I bind at the start of the application.

Once we add support for these in the application code, we then have to sample the texture in the shaders. We do that with the Sample function in HLSL, to which we pass the texture, the sampler state and the coordinates, and which returns the color at those coordinates. I am performing the sampling in my fragment shader using the function

SampleTexture2d(g_diffuseTexture, g_samplerState, i_textureData)

This expands to i_texture.Sample( i_samplerState, i_uv ), where i_texture, i_samplerState and i_uv stand for the three arguments passed in above.

I then multiply it with the vertex colors to get the final output color.
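To make the idea of "a texture is just an array indexed by UVs" concrete, here is a minimal CPU-side sketch in C++. It is not how the GPU or my engine samples textures: the Texture2D struct, SampleNearest and Modulate are hypothetical names made up for this illustration, it assumes a non-empty texture, and it only does nearest-neighbor lookup with clamping, whereas a real sampler state can also filter and wrap.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct Color { float r, g, b, a; };

// A 2D texture stored as a row-major array of texels (hypothetical CPU-side type)
struct Texture2D
{
    std::size_t width;
    std::size_t height;
    std::vector<Color> texels;
};

// Nearest-neighbor lookup: maps UVs in [0, 1] to the closest texel.
// UVs outside that range are clamped to the edge of the texture.
Color SampleNearest(const Texture2D& texture, float u, float v)
{
    const float clampedU = std::min(std::max(u, 0.0f), 1.0f);
    const float clampedV = std::min(std::max(v, 0.0f), 1.0f);
    const std::size_t x = std::min(static_cast<std::size_t>(clampedU * texture.width), texture.width - 1);
    const std::size_t y = std::min(static_cast<std::size_t>(clampedV * texture.height), texture.height - 1);
    return texture.texels[y * texture.width + x];
}

// Component-wise multiply, which is what multiplying the sampled color
// with the vertex color does in the fragment shader
Color Modulate(const Color& a, const Color& b)
{
    return { a.r * b.r, a.g * b.g, a.b * b.b, a.a * b.a };
}
```

With a white default texture, Modulate returns the vertex color unchanged, which is why white works well as the fallback when no texture is specified.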

Meshes with textures on them.

Adding Diffuse lighting

What is Diffuse Light?
A diffuse reflection is the reflection of light by a surface such that the incident light is scattered at many angles instead of just one. Diffuse reflection occurs when the light is scattered by centers beneath the surface, as shown in the picture. A non-shiny surface has a very high diffuse value. The amount of diffuse light reflected by the surface follows Lambertian reflectance, which is given by Lambert’s cosine law.

Most of the surfaces in the real world are diffuse by nature. The main exceptions include particularly shiny surfaces such as mirrors and metals.
Calculating Diffuse Light
I am using the following equation to calculate the diffuse light

(g_LightColor * (saturate(dot(normalize(g_LightRotation), normalize(i_normals)))) + ambientLight)

Here, g_LightColor is the color of the directional light and g_LightRotation is the forward direction of the light. These two values are passed from the application in the per-frame constant buffer. i_normals is the normal of that fragment. The normal of a surface is the vector that is perpendicular to the surface. The saturate call clamps the dot product to the range 0 to 1, so surfaces facing away from the light receive only the ambient light.
In the video below, you can see that I am simulating a warning light beacon.
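The equation above can be sketched on the CPU in C++ to check the numbers by hand. This is only an illustration of the same formula, not shader code from the engine: Vec3, Dot, Normalize, Saturate and DiffuseLight are hypothetical helpers, and the light direction and normal are assumed to be non-zero vectors.

```cpp
#include <algorithm>
#include <cmath>

struct Vec3 { float x, y, z; };

float Dot(const Vec3& a, const Vec3& b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Assumes v is not the zero vector
Vec3 Normalize(const Vec3& v)
{
    const float length = std::sqrt(Dot(v, v));
    return { v.x / length, v.y / length, v.z / length };
}

// Same behavior as the HLSL saturate intrinsic: clamp to [0, 1]
float Saturate(float value)
{
    return std::min(std::max(value, 0.0f), 1.0f);
}

// lightColor * saturate(dot(normalize(lightDirection), normalize(normal))) + ambient
Vec3 DiffuseLight(const Vec3& lightColor, const Vec3& lightDirection,
                  const Vec3& normal, const Vec3& ambient)
{
    const float lambert = Saturate(Dot(Normalize(lightDirection), Normalize(normal)));
    return { lightColor.x * lambert + ambient.x,
             lightColor.y * lambert + ambient.y,
             lightColor.z * lambert + ambient.z };
}
```

When the two vectors are parallel the surface receives the full light color plus ambient, and when they are perpendicular or opposed only the ambient term remains.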

Creating shaders based on various object transforms

Before we start discussing the topic, here is a small background about shaders and how data is sent from C++ to GPU.

Shader: Shaders are special programs that run on the GPU to create custom effects that are impossible or hard to recreate on the CPU side. There are two main types of shaders: the vertex shader and the fragment/pixel shader. As their names suggest, the vertex shader acts on each vertex of a triangle and the fragment shader acts on each fragment that is covered by that triangle.

The main output of the vertex shader is the position of a vertex in projected space, which the GPU then maps to window coordinates. We can also output any other data we like from the vertex shader, with color being the most common. All the outputs of the vertex shader are interpolated between the vertices of the triangle, and the interpolated values are then passed on to the fragment shader.

Fragment shaders run for each fragment inside the triangle and output a color, which is then displayed at the location that was output by the vertex shader. There are many other types of shaders, such as geometry shaders, tessellation shaders and mesh shaders, but we will only use vertex and fragment shaders for now.

When we send data from C++, the primary information that we submit is the vertex data. Apart from this, we can also send data using constant buffers. Constant buffers contain data that is constant over a frame or a draw call. In our engine, we have two constant buffers: the per-frame constant buffer and the per-draw-call constant buffer, which look like the following.

struct sPerFrame
{
	Math::cMatrix_transformation g_transform_worldToCamera;
	Math::cMatrix_transformation g_transform_cameraToProjected;

	Math::sVector g_CameraPositionInWorld;
	float padding0;

	float g_elapsedSecondCount_systemTime = 0.0f;
	float g_elapsedSecondCount_simulationTime = 0.0f;
	float padding1[2]; // For float4 alignment
};

struct sPerDrawCall
{
	Math::cMatrix_transformation g_transform_localToWorld;
	Math::cMatrix_transformation g_transform_localToProjected;
};

To use the above constant buffers, we need to declare them in shaders as shown below. Shaders for Direct3D are written in HLSL while those for OpenGL are written in GLSL. Since we will be working only with Direct3D, we will use HLSL.

cbuffer g_constantBuffer_perFrame : register( b0 )
{
	float4x4 g_transform_worldToCamera;
	float4x4 g_transform_cameraToProjected;

	float3 g_CameraPositionInWorld;
	float g_padding0;

	float g_elapsedSecondCount_systemTime;
	float g_elapsedSecondCount_simulationTime;
	// For float4 alignment
	float2 g_padding1;
};

cbuffer g_constantBuffer_perDrawCall : register( b2 )
{
	float4x4 g_transform_localToWorld;
	float4x4 g_transform_localToProjected;
};

We use the above matrices in our shaders to create effects based on the position of the object relative to the world or camera. Since we also output more data than just position from the vertex shader, I created a struct which contains all the data that I pass from vertex shader and which can be easily accessed in the fragment shader. The struct can be modified as necessary to include additional data.

struct VS_OUTPUT
{
	float4 o_vertexPosition_projected : SV_POSITION;
	float4 o_vertexPosition_local : TEXCOORD1;
	float4 o_vertexColor_projected : COLOR;
	float4 o_vertexColor_local : TEXCOORD2;
};

  1. Creating a shader that is independent of its position in the world.

To create a shader that has effects which are independent of world position, we need to use the local position of the fragment. This is what I am doing in my fragment shader

o_color = float4(
	floor(sin(i_VSInput.o_vertexPosition_local.x) / cos(i_VSInput.o_vertexPosition_local.x)),
	floor(sin(i_VSInput.o_vertexPosition_local.y) / cos(i_VSInput.o_vertexPosition_local.y)),
	floor(sin(i_VSInput.o_vertexPosition_local.z) / cos(i_VSInput.o_vertexPosition_local.z)),
	1.0) * i_VSInput.o_vertexPosition_local;

This produces the following output:

2. Creating an effect that an object can move through: The fragment shader output code is similar to the above, except that instead of using the local vertex position, we use the world position of the vertex.

3. Creating a grow and shrink effect: Until now, we were only modifying the fragment shader, but to create a grow and shrink effect, we need to change the positions of the vertices. We do this by creating a scaling matrix whose value we vary over time and multiplying it with the local position. The rest of the transformations remain the same.

float s = (sin(g_elapsedSecondCount_simulationTime) + 0.5) + 1;
// Uniform scaling matrix: s on the diagonal for x, y and z
float4x4 scale = {
	s, 0, 0, 0,
	0, s, 0, 0,
	0, 0, s, 0,
	0, 0, 0, 1 };
// Scale the local vertex and then transform it into world space
float4 vertexPosition_world;
float4 vertexPosition_local = float4( i_vertexPosition_local, 1.0 );
vertexPosition_local = mul(scale, vertexPosition_local);
vertexPosition_world = mul(g_transform_localToWorld, vertexPosition_local);

Even though this method works, it might not be the most optimized. So, instead of matrix multiplication, we can use scalar multiplication to get the same results.

float s = (sin(g_elapsedSecondCount_simulationTime) + 0.5) + 1;
// Transform the local vertex into world space
float4 vertexPosition_world;
// Scale the local position with a scalar before the local to world transform
float3 scaledLocalPosition = i_vertexPosition_local * s;
float4 vertexPosition_local = float4(scaledLocalPosition, 1.0 );
vertexPosition_world = mul(g_transform_localToWorld, vertexPosition_local);

4. Changing the effect on an object based on its proximity to the camera: To find the distance between the object and the camera, we take the length of the vector between the camera position and the world position of the vertex. I then lerp between red and the current vertex color based on that distance divided by the distance to the far plane of the camera (100 in the code below), clamped to the range 0 to 1. Vertices close to the camera turn red and vertices near the far plane keep their own color.

const float4 color = {1,0,0,1};
const float distance = length((g_CameraPositionInWorld - (vertexPosition_world).xyz));
output.o_vertexColor_projected = lerp(color, i_vertexColor_local, saturate(distance/100));

Real Time Rendering – Combining transforms.

For every object that is rendered in the scene, we currently have three transformations: local space to world space, world space to camera space and finally camera space to projected space (the 2D plane of the screen). All these transformations are performed in the vertex shader, which is called for every vertex in the object. Even though modern GPUs are massively parallel, multiplying these matrices for every vertex is still computationally intensive. Hence, instead of calculating them in the vertex shader, we calculate the single local to projected transform in C++ and pass it to the shader in a constant buffer, where the shader can apply it directly to produce the output position. A constant buffer holds data that does not change over a draw call. This reduces the number of instructions that the vertex shader compiles to.



As you can see from the above slideshow, the number of instructions in the shaders changed considerably after passing in the local to projected transform. If we need other transforms in a shader, we can still pass them in the constant buffers: world to camera belongs in the per-frame constant buffer because it does not change throughout a frame, while local to world is different for each object and stays in the per-draw-call constant buffer.

We calculate the local to projected transform by multiplying the local to world, world to camera and camera to projected matrices together. Matrix multiplication is associative but not commutative, so we can group the multiplications however we like as long as the matrices stay in the same order, with the camera to projected transform (or a product that starts with it) on the left-hand side. Unlike the other two, the camera to projected transform is not an affine transformation.

I am doing it in the following way because I want to save the local to camera transform in my per-draw-call constant buffer so that my speciality shaders from assignment 2 keep working. It is hard to get the other transforms back from the local to projected transform, since we would have to convert 2D space back into 3D space.

local to camera = world to camera * local to world;

local to projected = camera to projected * local to camera;
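The two lines above can be sketched in C++ with a bare-bones 4x4 matrix type. This is an illustration under assumptions, not the engine's Math::cMatrix_transformation class: Matrix4, Identity, Translation, Scale, Multiply and CombineTransforms are hypothetical names, and the matrices use the column-vector convention (translation in the last column) so that the rightmost matrix is applied first.

```cpp
#include <array>
#include <cstddef>

using Matrix4 = std::array<std::array<float, 4>, 4>;

Matrix4 Identity()
{
    Matrix4 result{};  // value-initialized to all zeros
    for (std::size_t i = 0; i < 4; ++i)
    {
        result[i][i] = 1.0f;
    }
    return result;
}

Matrix4 Translation(float x, float y, float z)
{
    Matrix4 result = Identity();
    result[0][3] = x;
    result[1][3] = y;
    result[2][3] = z;
    return result;
}

Matrix4 Scale(float s)
{
    Matrix4 result = Identity();
    result[0][0] = s;
    result[1][1] = s;
    result[2][2] = s;
    return result;
}

// Standard row-by-column matrix product: result = lhs * rhs
Matrix4 Multiply(const Matrix4& lhs, const Matrix4& rhs)
{
    Matrix4 result{};
    for (std::size_t row = 0; row < 4; ++row)
        for (std::size_t column = 0; column < 4; ++column)
            for (std::size_t k = 0; k < 4; ++k)
                result[row][column] += lhs[row][k] * rhs[k][column];
    return result;
}

// Computed once per draw call on the CPU instead of once per vertex on the GPU
Matrix4 CombineTransforms(const Matrix4& localToWorld, const Matrix4& worldToCamera,
                          const Matrix4& cameraToProjected)
{
    const Matrix4 localToCamera = Multiply(worldToCamera, localToWorld);
    return Multiply(cameraToProjected, localToCamera);
}
```

Grouping the product as (camera to projected * world to camera) * local to world gives the same matrix, but swapping the order of any two matrices generally does not.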


Optimizing the render pipeline

The render pipeline refers to the order in which we first submit data to render, then bind the shaders and finally draw the triangles before swapping the buffers. The more optimized the pipeline, the faster we are able to send data to the GPU, which results in higher frame rates and more responsive games. Currently, in the engine, we are binding an effect every time we draw an object (that is, on every draw call). This might not have any impact on small games like mine, but for any moderately sized game it starts to become a bottleneck.

So, the first thing we need to do is to figure out a way of not binding effects for every draw call i.e., we need to bind an effect once and then draw all the meshes which are using that effect. The way we can accomplish this is by using render commands. Render commands are essentially unsigned integers which represent a rendering operation. There can be many types of render commands such as one for drawing, one for transparency etc. The render command that we use in this assignment is for drawing an object.

I am representing a render command as an unsigned 64-bit integer, mainly because of the number of bits that data type provides. As you will see below, we need as many bits as possible so that we can store all the data for a command. When storing data in a render command, the order is important. The data with the highest sorting priority is stored in the most significant bits (MSB) and the data with the lowest priority in the least significant bits (LSB). When we sort, similar render commands end up next to each other, ordered from lowest to highest.

Since we only need to bind each effect once, the effect has the highest priority and hence goes in the MSB. We need to make sure that we have enough bits to store all the effects we could possibly have in the game. I am using 7 bits for effects, which allows for 128 different effects. The data I am storing for an effect is the index at which the effect is stored by the manager. This is similar to how I am storing the meshes in the render command. Since meshes are not that important for the sorting order, I am storing them in the 7 least significant bits of my render command.

Also, when drawing objects, the objects that are closer to the camera must be drawn before the objects that are farther away. So, we calculate the z-depth from the camera to each object, scale it to the number of bits we are allocating and then store the value in the render command. This has the second-highest priority after effects. So, my final render command structure looks like the following.

After storing the render commands for all the meshes, I use std::sort to sort everything in descending order. Then, for each command, I check whether its effect is already bound; if it is not, I bind it, and then I draw the mesh.
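Here is a minimal sketch of packing and sorting render commands in C++. The exact bit layout is an assumption made for this example (7 bits of effect index at the top, 50 bits of depth in the middle, 7 bits of mesh index at the bottom), and the names MakeRenderCommand, GetEffectIndex, GetMeshIndex and CountEffectBinds are hypothetical. For simplicity the sketch sorts in ascending order, so that a smaller depth value is drawn first within an effect.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed layout, from most to least significant bits: effect | depth | mesh
constexpr unsigned kEffectBits = 7;
constexpr unsigned kDepthBits = 50;
constexpr unsigned kMeshBits = 7;
static_assert(kEffectBits + kDepthBits + kMeshBits == 64, "fields must fill 64 bits");

constexpr unsigned kDepthShift = kMeshBits;
constexpr unsigned kEffectShift = kMeshBits + kDepthBits;

// normalizedDepth is the distance from the camera scaled to [0, 1]
std::uint64_t MakeRenderCommand(unsigned effectIndex, double normalizedDepth, unsigned meshIndex)
{
    const std::uint64_t maxDepth = (std::uint64_t{1} << kDepthBits) - 1;
    const double clampedDepth = std::min(std::max(normalizedDepth, 0.0), 1.0);
    const std::uint64_t depth =
        static_cast<std::uint64_t>(clampedDepth * static_cast<double>(maxDepth));
    const std::uint64_t effect = effectIndex & ((1u << kEffectBits) - 1);
    const std::uint64_t mesh = meshIndex & ((1u << kMeshBits) - 1);
    return (effect << kEffectShift) | (depth << kDepthShift) | mesh;
}

unsigned GetEffectIndex(std::uint64_t command)
{
    return static_cast<unsigned>(command >> kEffectShift);
}

unsigned GetMeshIndex(std::uint64_t command)
{
    return static_cast<unsigned>(command & ((1u << kMeshBits) - 1));
}

// Sorts the commands and counts how often an effect has to be bound while drawing them
std::size_t CountEffectBinds(std::vector<std::uint64_t> commands)
{
    std::sort(commands.begin(), commands.end());
    std::size_t bindCount = 0;
    bool hasBoundEffect = false;
    unsigned boundEffect = 0;
    for (const std::uint64_t command : commands)
    {
        const unsigned effect = GetEffectIndex(command);
        if (!hasBoundEffect || effect != boundEffect)
        {
            boundEffect = effect;  // the real renderer would bind the effect here
            hasBoundEffect = true;
            ++bindCount;
        }
        // the real renderer would draw the mesh at GetMeshIndex(command) here
    }
    return bindCount;
}
```

Because the effect index sits in the most significant bits, sorting the plain integers groups all commands of one effect together, so the number of binds equals the number of distinct effects rather than the number of meshes.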

Below you can see two videos captured from various angles to show how all meshes of one effect are drawn first based on the camera distance and then the others.

Below are the screenshots from graphics analyzer showing how one effect is bound first and then all the meshes are drawn before another effect is bound.

The orange rectangles highlight when effects are bound and green shows when meshes are drawn.


Optional Challenges:

The graphics analyzer highlights calls that set states which were already set in the previous frame and do not need to be set again, as shown below.

 ©John-Paul Ownby.

This is happening because I am setting vertex buffers in the draw call to DirectX. To optimize this, I set these buffers only when I switch meshes. The output looks like this.


Engine Feature – Final Update

This is the final part of my Engine feature series. Previous posts are here and here. My main focus this week is to make my library fast and stable, with fewer bugs. Most of the code changes I’ve made are under the hood.

The major update this week is the support for specifying the trigger and stick deflection dead zones both in Lua and the launcher. I also added support for multiple controller remapping in the launcher.

Adding multiple controller and dead zone support was not much work, but I had to think about how to arrange my binary data so that the file size is not large and the data is still easy to retrieve. Since reading binary data means reading from memory and incrementing the pointer to the next address to read, we can inadvertently access data that does not belong to the file. Also, in the case of dead zones, if the player does not enter any data, we do not write anything to the file, but we must have some way of checking that while reading, otherwise we would be reading garbage values. Keeping these things in mind, I’ve decided to add the controller number in front of the controller data and to have 1 byte before each dead zone that stores whether or not there is data for it.

The 0 above yellow line is the controller number and the 0’s above red line indicate that there is no dead zone data.
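A reader for this kind of layout could be sketched as follows. The layout here is a simplified assumption for illustration only (one byte of controller number, then a flag byte and an optional float for the trigger dead zone, then the same for the stick dead zone); the real file also holds the button mappings, and ControllerSettings, OptionalDeadZone, ReadDeadZone and ReadControllerSettings are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

struct OptionalDeadZone
{
    bool hasValue;
    float value;
};

struct ControllerSettings
{
    std::uint8_t controllerNumber;
    OptionalDeadZone triggerDeadZone;
    OptionalDeadZone stickDeadZone;
};

// Reads one flag byte and, only if it is non-zero, the float that follows it.
// Returns false instead of reading past the end of the data.
bool ReadDeadZone(const std::vector<std::uint8_t>& data, std::size_t& offset, OptionalDeadZone& out)
{
    if (offset >= data.size())
    {
        return false;
    }
    out.hasValue = data[offset++] != 0;
    out.value = 0.0f;
    if (out.hasValue)
    {
        if (data.size() - offset < sizeof(float))
        {
            return false;
        }
        std::memcpy(&out.value, &data[offset], sizeof(float));
        offset += sizeof(float);
    }
    return true;
}

// Reads the controller number followed by the two optional dead zones
bool ReadControllerSettings(const std::vector<std::uint8_t>& data, std::size_t& offset,
                            ControllerSettings& out)
{
    if (offset >= data.size())
    {
        return false;
    }
    out.controllerNumber = data[offset++];
    return ReadDeadZone(data, offset, out.triggerDeadZone)
        && ReadDeadZone(data, offset, out.stickDeadZone);
}
```

The flag byte costs one byte per dead zone, but it lets the reader skip missing values safely, and the bounds checks keep it from walking past the end of the file.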

The updated launcher looks like below.

What the project does:

The project allows you to

  1. Check when a button is pressed on a controller.
  2. Get the actual and normalized deflections for both triggers and sticks
  3. Register and deregister functions that are assigned to specific buttons, which are called when that button is pressed.
  4. Set vibration effects for the motors in the controller.

How to use the library:

    1. Download the project files for controller input and launcher.
    2. Add code to initialize and clean up the controller input in cbApplication.cpp.
    3. Include the header “Engine/ControllerInput/ControllerInput.h”.
    4. All the functions are present in the eae6320::UserInput::ControllerInput namespace.
    5. All the functions require passing the controller key codes, which are an enum.
    6. Some examples of using the library:

if (GetNormalizedStickDeflection(ControllerKeyCodes::LEFT_STICK, 0).x != 0)
	m_Camera->SetAngularSpeed(GetNormalizedStickDeflection(ControllerKeyCodes::LEFT_STICK, 0).x);
if (IsKeyPressed(ControllerKeyCodes::RIGHT_TRIGGER))
	m_Camera->SetCameraVelocity(Math::sVector(0, GetNormalizedTriggerDeflection(ControllerKeyCodes::RIGHT_TRIGGER), 0));
if (IsKeyPressed(ControllerKeyCodes::LEFT_TRIGGER))
	m_Camera->SetCameraVelocity(Math::sVector(0, -GetNormalizedTriggerDeflection(ControllerKeyCodes::LEFT_TRIGGER), 0));

    7. To register and deregister functions:

RegisterFunctionForCallback(ControllerKeyCodes::B, [this] {return Test(); }, 0);

    8. To use the launcher:
        1. Add it as a build dependency to the Game project so that it will build before the game does in the same folder.
        2. If your game name is different from “MyGame.exe”, you need to change it in “MainWindow.xaml.cs” at this line:

gameProcess.StartInfo.FileName = "MyGame.exe";

    9. Enjoy!!
    10. Contact me at my email if you encounter any bugs!

What I’ve learned from this project/What I have used in this project that I have learned over the past 6 months:

  1. Creating platform independent interfaces for others to use.
  2. Working with a third party  library. In this case XInput.
  3.  Working with both Lua and binary files.
  4.  Working on binary files in C# and WPF and calling the game from the launcher.

What gave me lot of trouble ;(

  1. Working with function pointers and std::function to use with the callback  feature.
  2. Creating separate paths for different builds in WPF. Since WPF by default builds in a platform-agnostic way, I had to make a lot of changes to the project settings to get the launcher to build in the same folder as the game.

Where to go from here:

I am planning to continue working on this project in the future. Since I, and hopefully some others in my class, have to use this project in the upcoming tech demo game, I might encounter some bugs, which I will be patching and updating here. I am also planning on continuing the work I was doing with audio during the winter break, and I will be posting updates on its progress in future blog posts.


The code provided below is released under the 2-Clause BSD license.

ControllerProject MyGame_SampleController

Engine feature – Progress Update – 2

Updates to the engine feature that I’m developing, detailed here.

As discussed in my previous post, I’ve completed the work on the basic features of my engine and started working on stretch goals and on making the library better overall.

Updates for this week:

  1. Button remapping now works with both Lua and Binary files

Users can specify which buttons are to be mapped to which other buttons in the settings.ini file that is present in the game directory. The remapping syntax looks like the following. I had to make sure that it is still easily accessible to the users, while being fast to read in code.

I also created a launcher application in WPF using C#, which shows a GUI for remapping buttons. The application writes out a binary file with the new mappings when the user selects OK, or keeps the current mappings when the user cancels, and then opens the game.

My library checks for the existence of both the Lua and the binary files, compares their timestamps and opens the file that was edited most recently. My launcher currently supports remapping of one controller, whereas the Lua file supports remapping of all four controllers.

2. Support for function callbacks of subscribed functions.

Users can register functions which are to be called when a particular button is pressed. This is especially useful for buttons such as menu, since the game does not have to check for button press on every update. The library provides functions to register and de-register function callbacks. All the buttons and sticks are supported along with multiple controller support. The register function accepts a function pointer to the required callback function. The syntax looks like the following.

And registering a function looks like the following.

All the callbacks happen asynchronously using std::async. Hence, the library does not expect any return values from the callback functions. I have worked with asynchronous calls before in C#, but this is the first time I have used them in C++.
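The register, deregister and asynchronous call steps described above could be sketched like this. It is a simplified stand-in and not the library's actual interface: CallbackRegistry, Register, Deregister and NotifyPressed are hypothetical names, the enum only lists a few buttons, and the sketch is not thread-safe if callbacks are registered while buttons are being processed.

```cpp
#include <functional>
#include <future>
#include <map>
#include <utility>

enum class ControllerKeyCodes { A, B, X, Y };

class CallbackRegistry
{
public:
    // Stores the callback for a button on a given controller, replacing any previous one
    void Register(ControllerKeyCodes key, std::function<void()> callback, int controller = 0)
    {
        m_callbacks[std::make_pair(controller, key)] = std::move(callback);
    }

    void Deregister(ControllerKeyCodes key, int controller = 0)
    {
        m_callbacks.erase(std::make_pair(controller, key));
    }

    // Runs the registered callback on another thread.
    // Returns an invalid future when nothing is registered for that button.
    std::future<void> NotifyPressed(ControllerKeyCodes key, int controller = 0) const
    {
        const auto found = m_callbacks.find(std::make_pair(controller, key));
        if (found == m_callbacks.end())
        {
            return std::future<void>();
        }
        return std::async(std::launch::async, found->second);
    }

private:
    std::map<std::pair<int, ControllerKeyCodes>, std::function<void()>> m_callbacks;
};
```

One detail worth knowing about std::async: the destructor of the future it returns waits for the task to finish, so the caller has to keep the future around if the callback should really run in the background.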

Since I have to check the controller state every tick to detect changes, instead of calling my update function on the core application thread, I run the core update function of my library on a new thread and use a while loop to continuously check for changes in the controller state. Creating a new thread keeps the library from bogging down the application thread and makes checking for changes more responsive.

I am using the Windows-specific “CreateThread” function to create the thread, and I wait for the thread to exit safely before closing the application instead of using “ExitThread”, since returning from the thread function is preferred over calling ExitThread in C++ applications, as discussed in the article.
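The same pattern (poll in a loop on a separate thread, then let the thread function return and wait for it before shutting down) can be sketched portably. Note that this sketch swaps in std::thread from the C++ standard library where my library uses CreateThread, and PollingThread, pollOnce and the 1 millisecond sleep are assumptions made for the example.

```cpp
#include <atomic>
#include <chrono>
#include <functional>
#include <thread>
#include <utility>

class PollingThread
{
public:
    // pollOnce is called repeatedly on the new thread until Stop() is called
    explicit PollingThread(std::function<void()> pollOnce)
        : m_pollOnce(std::move(pollOnce)), m_isRunning(true), m_thread([this] { Run(); })
    {
    }

    ~PollingThread()
    {
        Stop();
    }

    // Asks the loop to finish and waits for the thread function to return on its own
    void Stop()
    {
        m_isRunning = false;
        if (m_thread.joinable())
        {
            m_thread.join();
        }
    }

private:
    void Run()
    {
        while (m_isRunning)
        {
            m_pollOnce();
            // Sleep briefly so that the loop does not monopolize a CPU core
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
        }
    }

    std::function<void()> m_pollOnce;
    std::atomic<bool> m_isRunning;
    std::thread m_thread;  // declared last so that the members above exist before it starts
};
```

Setting a flag and joining lets destructors and cleanup code in the thread function run normally, which is the reason returning from the function is preferred over terminating the thread from inside.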


Things I’m planning on working for next week:

  1. Polish the library
  2. New “really far” Stretch goal: Create a sound library to playback audio from the controller’s headphone jack.

Engine feature – Progress Update -1

Posting updates on the Engine feature that I’m adding, as discussed here

Updates on features:

  1. Able to check when a button is pressed.

Completed. The user can check which buttons are pressed, including the triggers.

  2. Get the amount of deflection of triggers and sticks.

Completed. The user can get the actual and normalized (between 0 and 1) values of deflection for both the triggers and sticks.

  3. Built-in support for dead zones.

Completed. Dead zones are supported for both triggers and sticks.

  4. Support for haptic vibration. Every Xbox 360 controller contains 2 vibration motors. Support to set vibration levels for each individual motor.

Completed. The user can set individual vibration speeds for the motors or set the same speed for both.
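As an illustration of how normalized deflection and dead zones fit together, here is a small sketch in C++. It relies on the raw ranges that XInput reports (0 to 255 for triggers and -32768 to 32767 for each stick axis), but the function names and the rescaling are assumptions made for this example and not necessarily what my library does internally. The stick function returns a signed value between -1 and 1 for a single axis.

```cpp
#include <algorithm>
#include <cmath>

// Maps a raw trigger value (0 to 255) to [0, 1], treating everything
// at or below the dead zone as "not pressed"
float NormalizeTrigger(unsigned char rawValue, unsigned char deadZone)
{
    if (rawValue <= deadZone)
    {
        return 0.0f;
    }
    return static_cast<float>(rawValue - deadZone) / static_cast<float>(255 - deadZone);
}

// Maps a raw stick axis value (-32768 to 32767) to [-1, 1] with a dead zone around the center
float NormalizeStickAxis(short rawValue, short deadZone)
{
    const float magnitude = std::fabs(static_cast<float>(rawValue));
    if (magnitude <= deadZone)
    {
        return 0.0f;
    }
    const float maxMagnitude = 32767.0f;
    // Rescale so that the output starts at 0 right outside the dead zone
    const float scaled = std::min((magnitude - deadZone) / (maxMagnitude - deadZone), 1.0f);
    return rawValue < 0 ? -scaled : scaled;
}
```

Rescaling after subtracting the dead zone avoids a sudden jump in the output when the stick or trigger just leaves the dead zone.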

You can see the interfaces for the above features in the below screenshot:

Updates on stretch goals:

  1. Support for multiple controllers. The XInput API supports up to four controllers.

Completed. The functions that are used to get the above data also accept an optional controller number, which can be between 0 and 3 and defaults to 0. The library also polls the XInput API for new controllers every 5 seconds.

  2. Support for button remapping using Lua/Binary files.

Working on this feature now. I am also planning on adding support for defining custom dead zones in the Lua files. I expect to have a working version of this feature by next week.

  3. Support for pub-sub for events, where instead of the game polling for button presses, the engine will raise an event and call a specific function when the button is pressed.

I have yet to start working on this feature. I expect to finish a basic version of it by next week.

Things I’m planning on doing this week:

  1. Have a working version of Lua/Binary file remapping.
  2. Have a working version of event system.
  3. Make the library more performant, by identifying bottlenecks.


This is my first time working with low level APIs and it has been pretty straightforward until now. The XInput API has well written documentation that made writing the code easier. I haven’t faced any roadblocks for this version. But as I develop this module, I expect a few to present themselves on the way.

My main goal in creating this library is to create something that I would want to use myself in a game, hence most of the features that I am implementing are ones that I have used in games. One example is multiple controller support, which can be used for local multiplayer. To support this, all functions accept a controller number that defaults to the first controller, in case the developer does not want to support multiple controllers.

Game Engineering – Adding New feature to Engine

Engine Feature Proposal

I am planning on implementing XInput in the engine, so that the engine can support Xbox controllers. It will have the following features

  1. Able to check when a button is pressed.
  2. Get the amount of deflection of triggers and sticks.
  3. Built in support for dead zones.
  4. Support for haptic vibration. Every Xbox 360 controller contains 2 vibration motors. Support to set vibration levels for each individual motor.

This information is provided through functions. There will be an enumeration of all the buttons present on the controller. Functions which poll for a specific button or trigger require the game to send the enum value as a function parameter. There will be no need to create data files in the first iteration of the feature (see stretch goals).

Implementation Details:

I am planning on creating a platform-independent interface which exposes all this functionality as functions. Since XInput is a Windows-specific API, the Windows-specific code will be hidden behind platform-independent functions, which will be public. No data will be written to or read from files in the first version of the implementation. So, the user will have a set of functions that they can call to get the data from the library.
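The split between a public, platform-independent function and hidden platform-specific code could look roughly like the sketch below. Only IsKeyPressed and ControllerKeyCodes are names that my library actually uses; the Platform namespace, GetPressedButtonMask and g_fakeButtonMasks are hypothetical and exist only for this illustration, and the bit values of the enum are an assumption. In a real build the Platform function would live in a Windows-only source file and query XInput, whereas here it is a stand-in that reads from an array.

```cpp
#include <cstdint>

namespace ControllerInput
{
    // One bit per button (the values here are assumed for the example)
    enum class ControllerKeyCodes : std::uint16_t
    {
        DPAD_UP = 0x0001,
        A = 0x1000,
        B = 0x2000,
    };

    namespace Platform
    {
        // Declared once, defined separately for each platform
        std::uint16_t GetPressedButtonMask(int controllerNumber);
    }

    // Public, platform-independent function that the game calls
    bool IsKeyPressed(ControllerKeyCodes key, int controllerNumber = 0)
    {
        const std::uint16_t mask = Platform::GetPressedButtonMask(controllerNumber);
        return (mask & static_cast<std::uint16_t>(key)) != 0;
    }
}

// Stand-in backend used only for this sketch; a Windows backend would query XInput instead
namespace ControllerInput
{
    namespace Platform
    {
        std::uint16_t g_fakeButtonMasks[4] = {};

        std::uint16_t GetPressedButtonMask(int controllerNumber)
        {
            if (controllerNumber < 0 || controllerNumber > 3)
            {
                return 0;
            }
            return g_fakeButtonMasks[controllerNumber];
        }
    }
}
```

The game code only ever sees IsKeyPressed and the enum, so supporting another platform would mean adding another definition of the Platform function without touching the public interface.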

Stretch Goals:

  1. Support for multiple controllers. The XInput API supports up to four controllers.
  2. Support for button remapping using Lua/Binary files
  3. Support for pub-sub for events, where instead of the game polling for button presses, the engine will raise an event and call a specific function when the button is pressed.

Game Engineering – Building Game Engine – Part 9

Creating Human readable and Binary Files for Effects

As discussed in the previous posts, having a human readable file to edit data and having a binary file to load that data is more useful, compared to hardcoding the data in code. In this assignment, we have to create a Human readable format for effects and then convert it into binary format during build time and then load the required data from the binary file during runtime.

Human Readable effect file:

The file contains the locations of the vertex and fragment shaders and the type of render state, which can be “AlphaTransparency”, “DepthBuffering” or “DrawBothTriangleSides”. The file looks like the following:

Effect =
{
	VertexShaderLocation = "Shaders/Vertex/standard.shader",
	FragmentShaderLocation = "Shaders/Fragment/standard.shader",
	RenderState = "DepthBuffering",
}


Binary File:

The binary file saves the above information, but also saves the length of the path to Vertex Shader. I am storing this value to find where the path to fragment shader starts in the binary file. If I do not include that info, I have to iterate through every byte to check where the null termination character is and then point the next byte as fragment shader location. I am also adding a null terminator after the vertex and fragment shader locations, so that we can directly load the values as strings.

The input shaders are stored in our $(GameInstallDir)/data folder, but the working directory for the game is the $(GameInstallDir) folder. I chose to prepend “data/” to the file names at build time, so that we do not have to add it during runtime. The upside of this method is that it is faster to load the paths during runtime, while the downside is an increased file size. I think the improvement in load speed offsets the increase in file size, hence I included the prefix.


The red rectangle is the Renderstate. The blue one is the length of path for vertex shader and the violet rectangles show the null terminators after each path.


RenderState = uint8_t = 1 Byte

Length of Path = uint8_t = 1 Byte

Vertex Shader Path = 35 characters = 35 Bytes

null character = 1 Byte

Fragment Shader Path = 37 Characters  = 37 Bytes

null character = 1 Byte

Total = 76 Bytes

Extracting data:

The above screenshot shows the code to extract data from the file. First, we extract the render state, the length of the vertex shader path and the vertex shader path itself. To find where the path to the fragment shader starts, we add the length of the vertex shader path plus 1 (to account for the null terminator) to the offset.
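Since the screenshot may not be visible, here is a minimal sketch of the same extraction steps in C++. It follows the byte layout listed above (render state, path length, vertex shader path, null terminator, fragment shader path, null terminator), but EffectData and ParseEffect are hypothetical names, and the extra bounds checks are my additions for the example rather than a copy of the engine code.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

struct EffectData
{
    std::uint8_t renderState;
    std::string vertexShaderPath;
    std::string fragmentShaderPath;
};

// Parses the contents of a binary effect file that has been loaded into memory.
// Returns false if the buffer is too short or the paths are not null-terminated.
bool ParseEffect(const std::vector<char>& buffer, EffectData& out)
{
    if (buffer.size() < 2 || buffer.back() != '\0')
    {
        return false;
    }
    std::size_t offset = 0;
    out.renderState = static_cast<std::uint8_t>(buffer[offset]);
    offset += 1;
    const std::size_t vertexPathLength = static_cast<std::uint8_t>(buffer[offset]);
    offset += 1;
    // The vertex path, its null terminator and at least one more byte must fit in the buffer
    if (offset + vertexPathLength + 1 >= buffer.size() || buffer[offset + vertexPathLength] != '\0')
    {
        return false;
    }
    // The null terminator lets us load the path directly as a string
    out.vertexShaderPath = &buffer[offset];
    // Skip the vertex path plus 1 for its null terminator to reach the fragment path
    offset += vertexPathLength + 1;
    out.fragmentShaderPath = &buffer[offset];
    return true;
}
```

Storing the length up front means that the start of the fragment shader path is found with one addition instead of a scan for the null terminator.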


Since most code is changed in the backend, there are no visible changes to the output.




MyGame_x64_1101 MyGame_x86_1101