Saturday, June 28, 2008

Parallel Computing with MPI - Part VI: Parallel Computing in a Nutshell

Finally, we reach the most interesting part of this series. In this post, I will present, in a nutshell, the parts that you need to know in order to get started with parallel computing. I will base this discussion on an article that I wrote in 2005 on the CFD-Wiki. You can find it here. Of course, a lot has changed since then, but the basics and semantics are exactly the same. Any improvements in parallel computing will happen at the level of implementation, hardware, and additional functionality. The foundations have been laid out nicely and will remain that way for a while.

This article will be a very long one, although I hate writing long articles on a blog. So if you would like me to split it, just drop me a comment or an email.

What is Parallel Computing?

Parallel computing may be defined as the simultaneous use of several processors to execute a computer program. In the process, inter-processor communication is required at some point in the code; therefore, the code has to be instructed to run in parallel.

For example, a code that computes the sum of the first N integers is very easy to program in parallel. Each processor is responsible for computing a fraction of the global sum. At the end of the execution, however, all processors have to communicate their local sums to a single processor, which adds them all up to obtain the global sum.
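To make this concrete, here is a minimal sketch of that example in C, using two of the MPI functions that we will meet later in the series (MPI_Reduce performs the "communicate the local sums" step; for simplicity, the sketch assumes N is divisible by the number of processors):

#include <stdio.h>
#include "mpi.h"

int main(int argc, char* argv[])
{
    int rank, size;
    long N = 1000000;                  /* sum the integers 1..N          */
    long localSum = 0, globalSum = 0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* which processor am I?          */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* how many processors are there? */

    /* each processor sums its own chunk of the integers */
    long chunk = N / size;             /* assumes N is divisible by size */
    long start = rank * chunk + 1;
    long end   = start + chunk - 1;
    for (long i = start; i <= end; i++)
        localSum += i;

    /* all the local sums are added up on processor 0 */
    MPI_Reduce(&localSum, &globalSum, 1, MPI_LONG, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("sum of the first %ld integers = %ld\n", N, globalSum);

    MPI_Finalize();
    return 0;
}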

Types of Parallel Computers

There are two types of parallel architectures:
  1. Shared memory multiprocessor
  2. Distributed memory multicomputer
In a shared memory multiprocessor, all processors share the same "memory addresses". Imagine that these processors are all on the same motherboard. Multi-core CPUs (dual cores, quad cores) and multi-CPU-socket motherboards, such as server motherboards (or the computers we have in our offices at UTSI), are examples of a shared memory multiprocessor machine. In this setup, all processors have direct access to any of the variables declared in the code. This configuration is very attractive from a programming point of view: there is no need to send data from one computer to another. However, the main disadvantage lies in possible conflicts when accessing or altering data that resides in memory (imagine processor one is computing the temperature at a point while processor two tries to access that same value for its own computation - in this case, there is a conflict and memory cannot be accessed). This is usually remedied by inserting function calls to make sure that memory addresses can be accessed without conflict. Of course, this offsets the benefits of sharing memory addresses, and therefore this configuration is no longer popular since conflict management becomes a tedious and ambiguous process.


In a distributed memory configuration, each processor has its own memory addresses and data has to be sent from one processor to another. Computers are usually connected via a network switch, and data flows over the network from one processor to another. Note, however, that multi-core CPUs as well as multi-socket motherboards CAN use this configuration. In this case, although the memory modules are the same for all processors, each one of them reserves a portion of this memory just as if it existed independently of the others. Also, when multi-core systems are used, data is sent much faster since it does not have to go through a network. The main advantage of this configuration is robustness. For example, new machines can always be added to the parallel computer. Office computers can be used as part of the parallel computer when they are idle, and if the GRAs do not work overnight!



So How Does it Work?

The history of parallel computing extends over a long period of time and may be traced to the early 60's (check this link for a concise historical perspective). However, the most important milestone took place in 1992, when a group of scientists (most likely computer guys) was assembled to create a standard for message passing; their job was to formulate a universal framework for exchanging data in a parallel computer. So they retreated for a while and came up with what is called today the Message Passing Interface, or MPI. It defines a set of functions and rules for how data is to flow from one processor to another within a parallel computer. Of course, they did not have to do the actual implementation, as this is left to third-party vendors and the open source community.

So how does parallel computing work? The notion is very simple. A computer code runs on different processors. At one point or another, data has to be exchanged between processors. Therefore, a function call has to be included at that point to instruct the code to send or receive data. Once the executable reaches that line of code, it allocates some buffer memory, copies the data that is to be sent, and sends this data to the receiving computer. On the receiving end, the code is waiting for some data to arrive. Of course, it knows what it will be receiving; therefore, it allocates enough buffer space and waits for this data. Once the data arrives, it copies it to the designated address in memory. And that's all!
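Here is a minimal sketch of that send/receive scenario with two processes and a buffer of 100 doubles (the buffer size and the message tag are arbitrary choices for illustration):

#include <stdio.h>
#include "mpi.h"

int main(int argc, char* argv[])
{
    int rank;
    double data[100];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        /* fill the buffer and send it to process 1 */
        for (int i = 0; i < 100; i++)
            data[i] = (double) i;
        MPI_Send(data, 100, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);
    }
    else if (rank == 1) {
        MPI_Status status;
        /* process 1 knows it is expecting 100 doubles from process 0 */
        MPI_Recv(data, 100, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, &status);
        printf("received the data, last entry = %f\n", data[99]);
    }

    MPI_Finalize();
    return 0;
}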

Of course, there are many other ways that this happens, and different implementations do things differently, but the scenario presented above is one viable method. I still have to include some references though.

Cite as:
Saad, T. "Parallel Computing with MPI - Part VI: Parallel Computing in a Nutshell". Weblog entry from Please Make A Note. http://pleasemakeanote.blogspot.com/2008/06/parallel-computing-with-mpi-part-vi.html

Wednesday, June 25, 2008

Parallel Computing with MPI - Part V: MPICH2 and Visual Studio

Now that we have covered the sequential code for the finite difference solution of the 2D heat equation, it is time to start thinking in parallel. But first, we have to set the grounds for this adventure. This will be the topic of this post.

MPI, aka the Message Passing Interface, is a standard that defines a set of functions that allow a programmer to instruct their code to execute tasks in parallel. Of course, for this set of functions to be portable, they have to be compiled as a library. Then, the functions will be visible inside the code and used as necessary. I will defer my discussion of parallel computing to the next post, as the sole purpose of this one is to have MPI ready so that we can experiment with it.

One thing that you have to understand is that MPI is only a standard; the implementations vary. This means that you will find several "MPI implementations" out there, some of them free while others require the purchase of a license, but nonetheless, they all offer the same functionality, function names, and datatypes as dictated by the MPI standard.

Fortunately, God created some people who believe that science is not personal and are willing to take the time and put in the effort to implement the MPI standard for free. The most popular implementations are MPICH and LamMPI. I use MPICH on Windows and LamMPI on Mac OS X.

For users of Unix-based systems, I am going to assume that you already have enough knowledge of how to compile, install, and link a library to your codes. Just do the same for MPI. You can download the MPICH2 implementation here and the LamMPI implementation here.

Now, for Windows users, I will outline the installation process of MPICH2 so that it is streamlined with Microsoft Visual Studio 2008. Both the Express and Professional editions can be used. Remember that the Express edition can be downloaded for free!

I assume that you already have Visual Studio set up. There are two versions of MPICH2, namely, the 32 bit and the 64 bit. For the time being, let us focus on the 32 bit version. Let us first set up the environment for MPI:
  1. Download and install MPICH2 32 bit edition
  2. Add the following to your path
    C:\Program Files\MPICH2\bin
    (in Vista: right click My Computer/Properties/Advanced System Settings/Advanced/Environment Variables and edit the path variable for either the system or user variables. Do the same in XP, but you don't have to go through Advanced System Settings because it doesn't exist.)
  3. Add the following exceptions to your firewall:
    C:\Program Files\MPICH2\bin\mpiexec.exe
    C:\Program Files\MPICH2\bin\smpd.exe
    Usually, you will receive a message to unblock these programs. But in case you don't, you have to add them manually
  4. Register a user account with MPI. Go to
    C:\Program Files\MPICH2\bin\wmpiregister
    and type in the user account and password that will be used to start MPI jobs. You should have the same user account and password on all your cluster machines
  5. Open port 8676. Go to
    C:\Program Files\MPICH2\bin\wmpiconfig
    and select "port" at the top of the list
Note that if you install the 32 bit MPICH2 on a 64 bit windows, the MPICH installation directory will be under "Program Files (x86)".

Now, to set up the environment in Visual Studio, we have to tell the compiler where to find the MPI header files and libraries. Do the following:
  1. Start Visual Studio
  2. Go to: Tools/Options/Projects and Solutions/Visual C++ Directories
  3. In the platform combo box, select win32
  4. In the Directories combo box, select "Include files" and add the following entry
    C:\Program Files\MPICH2\include (or Program Files (x86) for 64 bit windows)
  5. Select "Library files" and add the following entry
    C:\Program Files\MPICH2\lib\ (or Program Files (x86) for 64 bit windows)
Voila!

Now your Visual Studio setup should be working like a charm! We'll test these settings in the next post.
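If you cannot wait, the following minimal program is enough to verify that everything is wired correctly (you will also have to add the MPI import library from the lib directory above to the linker's Additional Dependencies; in my MPICH2 installation it is called mpi.lib, but check your lib folder if the name differs):

#include <stdio.h>
#include "mpi.h"

int main(int argc, char* argv[])
{
    int rank, size;

    MPI_Init(&argc, &argv);                 /* start up the MPI environment  */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* my process number             */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total number of processes     */

    printf("Hello from process %d of %d\n", rank, size);

    MPI_Finalize();                         /* shut down the MPI environment */
    return 0;
}

Build it, then run it from a command prompt with something like mpiexec -n 4 MyProgram.exe; you should see one line printed per process.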

Cite as:
Saad, T. "Parallel Computing with MPI - Part V: MPICH2 and Visual Studio". Weblog entry from Please Make A Note. http://pleasemakeanote.blogspot.com/2008/06/parallel-computing-with-mpi-part-v.html

Sunday, June 22, 2008

Parallel Computing with MPI - Part IV: Programming the 2D Heat Equation in C using Visual Studio



In this post, you'll find the code that I have written for the numerical solution of the 2D heat equation. This is the sequential code, of course. By the way, sequential is the opposite of parallel, just in case you didn't know. There should be no confusion here: sequential simply means that the code is executed one line at a time, which is how all ordinary programs run anyway.

To make things easy for newcomers, I have not used any extra header files to hold the structure definitions and the function prototypes. Good programming practice, however, requires that you place such declarations in header files.
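For the curious, such a header could look something like the sketch below (the names are purely illustrative and are not the ones used in the posted code):

/* heat2d.h - an illustrative header; the posted code keeps everything in one file */
#ifndef HEAT2D_H
#define HEAT2D_H

typedef struct {
    int nx, ny;       /* number of grid points in x and y       */
    double delta;     /* uniform grid spacing (dx = dy = delta) */
    double alpha;     /* thermal diffusivity                    */
    double** T;       /* temperature field                      */
} SimulationData;

double** Make2DDoubleArray(int arraySizeX, int arraySizeY);
void UpdateTemperature(SimulationData* sim, double dt);

#endif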

You will find two files over here. The first one contains only the C code. Although the file extension is cpp, the code is in C; a C++ compiler will compile it just fine.

The second is a rar archive and includes the Visual Studio solution files for those who wish to use Visual C++ Express. You can download it here. Simply extract the rar archive and open the ".sln" file.

The minimum that is expected from you is to read the whole code and understand it before pressing the build or run buttons. I will most likely upload another version that has detailed comments on how things are done.

Table of Contents:

  1. Introducing the Heat Equation
  2. Discretization of the Heat Equation
  3. Stability Analysis and the Courant-Friedrichs-Lewy Criterion
  4. Programming the Heat Equation in C using Visual Studio
  5. MPICH2 and Visual Studio
  6. Parallel Computing in a Nutshell
  7. Measuring Parallel Performance
  8. Essential Ingredients of Message Passing
  9. MPI Warmup
  10. Parallelizing the Heat Equation

Cite as:
Saad, T. "Parallel Computing with MPI - Part IV: Programming the 2D Heat Equation in C using Visual Studio". Weblog entry from Please Make A Note. http://pleasemakeanote.blogspot.com/2008/06/parallel-computing-with-mpi-part-iv.html

Wednesday, June 18, 2008

Parallel Computing with MPI - Part III: Stability Analysis of the Discretized Heat Equation

In today's part of this series, we'll go over a basic stability analysis for the discretized heat equation that was derived in the previous post.

Here, I will follow John Anderson's neat exposition of the stability analysis for difference equations. We start by defining a couple of terms first. Let,


D = exact solution of the difference equation (infinite accuracy computer)
N = numerical solution of the difference equation (finite accuracy computer)
ε = Round off error

then, by definition,

ε = N - D

which is the error between the exact solution and the finite accuracy numerical solution of the difference equation. Our difference equation can therefore be represented by N, since we are most likely to solve it on a finite accuracy computer, right? To make headway, we replace T in our difference equation by N. At the outset, we get

N_{i,j}^{n+1} = N_{i,j}^n + (α Δt / δ²) ( N_{i+1,j}^n + N_{i-1,j}^n + N_{i,j+1}^n + N_{i,j-1}^n - 4 N_{i,j}^n )

But the numerical solution can be expressed in terms of the round off error and the exact difference equation solution as

N = D + ε

If we substitute this into the difference equation, we get

(D + ε)_{i,j}^{n+1} = (D + ε)_{i,j}^n + (α Δt / δ²) [ (D + ε)_{i+1,j}^n + (D + ε)_{i-1,j}^n + (D + ε)_{i,j+1}^n + (D + ε)_{i,j-1}^n - 4 (D + ε)_{i,j}^n ]

However, the exact solution D of the difference equation already satisfies the discretized equation, so D simply cancels out of the above equation. Therefore, the round off error also satisfies the difference equation, i.e.

ε_{i,j}^{n+1} = ε_{i,j}^n + (α Δt / δ²) ( ε_{i+1,j}^n + ε_{i-1,j}^n + ε_{i,j+1}^n + ε_{i,j-1}^n - 4 ε_{i,j}^n )
Since we are solving on a real computer, round off errors are inevitable. Therefore, we want to study how these errors evolve from one timestep to another. If ε gets smaller or, at worst, stays the same, then the solution is stable. If it grows, the solution is unstable. Mathematically, for a numerical solution to be stable, this can be expressed as

| ε_{i,j}^{n+1} / ε_{i,j}^n | ≤ 1

Now, we wish to seek a general form of the round off error. Thankfully, once upon a time, a mathematician called Joseph Fourier showed that any function can be expressed as an infinite series of sines and cosines, or more conveniently, exponentials. Therefore, it is reasonable to assume the following general form for the round off error

ε(x, y, t) = Σ_m A_m(t) e^{i (k_{x,m} x + k_{y,m} y)}
where i is the imaginary unit, k_{x,m} and k_{y,m} are the wave numbers of mode m, and A_m(t) reflects the time dependence of the round off error. One may also assume a general form for A_m(t). Since the error will either grow or decay with time, one can set

A_m(t) = e^{a t}

so that the general form of the round off error is

ε(x, y, t) = Σ_m e^{a t} e^{i (k_{x,m} x + k_{y,m} y)}

Now, we can substitute this form into the discretized equation. Because the equation is linear, each mode satisfies it separately, so in what follows we work with a single representative mode with wave numbers k_x and k_y. For example,

ε(x + δ, y, t) = e^{a t} e^{i (k_x (x + δ) + k_y y)} = ε(x, y, t) e^{i k_x δ}
Upon substitution and simplification, i.e. dividing by ε(x, y, t), we get

e^{a Δt} = 1 + (α Δt / δ²) ( e^{i k_x δ} + e^{-i k_x δ} + e^{i k_y δ} + e^{-i k_y δ} - 4 )

Nice! Notice that the infinite sum is no longer needed. Also, the exponential terms inside the parentheses can be collapsed into cosines

e^{a Δt} = 1 + (2 α Δt / δ²) ( cos(k_x δ) + cos(k_y δ) - 2 )

This form, albeit quite simple, does not help in the subsequent steps. We need to invoke one extra trigonometric identity

sin²(θ/2) = (1 - cos θ) / 2

to obtain

e^{a Δt} = 1 - (4 α Δt / δ²) ( sin²(k_x δ/2) + sin²(k_y δ/2) )

Now we take the absolute value of this expression, which is nothing but the amplification factor ε^{n+1}/ε^n, and obtain the stability requirement

| 1 - (4 α Δt / δ²) ( sin²(k_x δ/2) + sin²(k_y δ/2) ) | ≤ 1
we get two cases

Case I:

1 - (4 α Δt / δ²) ( sin²(k_x δ/2) + sin²(k_y δ/2) ) ≤ 1

or

(4 α Δt / δ²) ( sin²(k_x δ/2) + sin²(k_y δ/2) ) ≥ 0

which is always true. So this case does not pose any limitations.

Case II:

1 - (4 α Δt / δ²) ( sin²(k_x δ/2) + sin²(k_y δ/2) ) ≥ -1

or

(4 α Δt / δ²) ( sin²(k_x δ/2) + sin²(k_y δ/2) ) ≤ 2

But, since the square of the sine is always less than or equal to 1,

sin²(k_x δ/2) + sin²(k_y δ/2) ≤ 2

Going back to the original inequality and taking this worst case, for stability we must satisfy

(4 α Δt / δ²) × 2 ≤ 2

or

Δt ≤ δ² / (4 α)
This is known as the Courant-Friedrichs-Lewy, or CFL, condition. Although not in the context in which it was originally derived, a CFL condition may be obtained for almost any numerical analysis of a differential equation.

In general, if the grid spacing is not uniform, we would have

Δt ≤ 1 / [ 2 α ( 1/Δx² + 1/Δy² ) ]
Note that in Horak and Gruber's paper (Parallel Numerical Solution of 2-D Heat Equation), there is a typo in the stability condition equation on page 49 (first equation on that page). The "2" in the denominator should actually be a "4" as was shown in this post. The CFL condition that the authors refer to actually corresponds to the one dimensional heat equation.
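In a code, this condition simply caps the largest timestep you are allowed to take. Here is a small sketch of how one might compute it (the function name and the safety factor are my own illustrative choices):

/* Returns a stable timestep for the explicit scheme from the CFL condition above.
   For dx = dy = delta this reduces to dt = safety*delta*delta/(4*alpha). */
double stableTimestep(double alpha, double dx, double dy, double safety)
{
    double dtMax = 1.0 / (2.0 * alpha * (1.0/(dx*dx) + 1.0/(dy*dy)));
    return safety * dtMax;   /* safety is typically chosen slightly below 1 */
}

For a uniform grid, calling stableTimestep(alpha, delta, delta, 0.9) gives a timestep 10% below the stability limit.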

In my next post, I'll post a sequential program for solving the heat equation in C using visual studio.

REFERENCES

1. ANDERSON, J. D. 1995 Computational fluid dynamics : the basics with applications. McGraw-Hill.

2. TANNEHILL, J. C., ANDERSON, D. A. & PLETCHER, R. H. 1997 Computational fluid mechanics and heat transfer. 2nd edn. Taylor & Francis.

Table of Contents:

  1. Introducing the Heat Equation
  2. Discretization of the Heat Equation
  3. Stability Analysis and the Courant-Friedrichs-Lewy Criterion
  4. Programming the Heat Equation in C using Visual Studio
  5. MPICH2 and Visual Studio
  6. Parallel Computing in a Nutshell
  7. Measuring Parallel Performance
  8. Essential Ingredients of Message Passing
  9. MPI Warmup
  10. Parallelizing the Heat Equation

Cite as:
Saad, T. "Parallel Computing with MPI - Part III: Stability Analysis of the Discretized Heat Equation". Weblog entry from Please Make A Note. http://pleasemakeanote.blogspot.com/2008/06/parallel-computing-with-mpi-part-iii.html

Monday, June 16, 2008

Parallel Computing with MPI - Part II: Discretization of the Heat Equation

In this article, we'll discuss how to discretize the heat equation using an explicit Euler differencing scheme. The discretization is first order forward in time, and second order central in space. I am assuming that you are somewhat familiar with finite difference methods. If not, then discuss it with your therapist.

Again, we are considering the heat equation given by

∂T/∂t = α ( ∂²T/∂x² + ∂²T/∂y² )    (1)
The underlying idea in discretization is to first divide the domain of interest (i.e. the region over which we are solving the heat equation) into a grid of discrete points (in space and time). When this is done, the derivatives can be approximated using the values of the dependent variable T at the discrete points. For convenience, the grid points are indexed using i in the horizontal x direction, and j in the vertical y direction.

Similarly, the time dimension is divided into time levels n, n+1, n+2... separated by a timestep Δt.

Next, we have to transform Eq. (1) into an algebraic relation at (any) point (i, j) and time level n. To do this, we have to convert each derivative into an equivalent algebraic expression. For the unsteady term, we use a forward Euler scheme as follows

∂T/∂t ≈ ( T_{i,j}^{n+1} - T_{i,j}^n ) / Δt    (2)

where n denotes the time level and Δt denotes the timestep size. The form given in Eq. (2) is called forward in time because it uses the value of T at the next time level (n+1), which is unknown. In essence, we are solving for T at (n+1).

For the spatial derivatives, we will use explicit central differencing. This means that the discretized form of the spatial derivatives will be evaluated at the current time level n, where the values are known. This is written as

∂²T/∂x² ≈ ( T_{i+1,j}^n - 2 T_{i,j}^n + T_{i-1,j}^n ) / Δx²    (3)

and

∂²T/∂y² ≈ ( T_{i,j+1}^n - 2 T_{i,j}^n + T_{i,j-1}^n ) / Δy²    (4)

If we further allow Δx = Δy = δ, then upon substitution of Eqs. (2)-(4) into Eq. (1), we obtain the full discretized form of the heat equation

( T_{i,j}^{n+1} - T_{i,j}^n ) / Δt = (α / δ²) ( T_{i+1,j}^n + T_{i-1,j}^n + T_{i,j+1}^n + T_{i,j-1}^n - 4 T_{i,j}^n )    (5)

Finally, the simplicity of the explicit scheme is clearly visible in Eq. (5), as one can easily solve for the new time level (n+1)

T_{i,j}^{n+1} = T_{i,j}^n + (α Δt / δ²) ( T_{i+1,j}^n + T_{i-1,j}^n + T_{i,j+1}^n + T_{i,j-1}^n - 4 T_{i,j}^n )    (6)

This means that there is no need to solve a linear system of equations. If we had used an implicit method, the RHS of Eq. (6) would be at time level (n+1) and, therefore, one would have to solve a linear system of equations at each time step.
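To see how little work the explicit scheme requires, here is a sketch of what Eq. (6) looks like in C for the interior points (allocation, boundary conditions, and the time loop are omitted, and the names are illustrative; dx = dy = delta as above):

/* One explicit timestep of Eq. (6) on a uniform grid.
   T holds time level n, Tnew receives time level n+1. */
void explicitStep(double** T, double** Tnew, int nx, int ny,
                  double alpha, double dt, double delta)
{
    double r = alpha * dt / (delta * delta);   /* must respect the CFL limit (next post) */
    for (int i = 1; i < nx - 1; i++)
        for (int j = 1; j < ny - 1; j++)
            Tnew[i][j] = T[i][j]
                       + r * (T[i+1][j] + T[i-1][j] + T[i][j+1] + T[i][j-1]
                              - 4.0 * T[i][j]);
}

Each new value depends only on known values from time level n, which is exactly why no linear system has to be solved.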

In the next post, we'll discuss the stability of the discretized equation and derive the CFL criterion to ensure stability of the numerical solution given by Eq. (6).


Table of Contents:

  1. Introducing the Heat Equation
  2. Discretization of the Heat Equation
  3. Stability Analysis and the Courant-Friedrichs-Lewy Criterion
  4. Programming the Heat Equation in C using Visual Studio
  5. MPICH2 and Visual Studio
  6. Parallel Computing in a Nutshell
  7. Measuring Parallel Performance
  8. Essential Ingredients of Message Passing
  9. MPI Warmup
  10. Parallelizing the Heat Equation

Cite as:
Saad, T. "Parallel Computing with MPI - Part II: Discretization of the Heat Equation". Weblog entry from Please Make A Note. http://pleasemakeanote.blogspot.com/2008/06/parallel-computing-with-mpi-part-ii.html

Sunday, June 15, 2008

Parallel Computing with MPI - Part I: The Heat Equation

In this series, I will introduce parallel programming concepts with MPI. This series is prompted by the "Numerical Methods for PDEs" class given by Professor Christian Parigger of UTSI.

Of course, since things are done best with concrete examples, our objective will be to parallelize the 2D heat equation (i.e. unsteady diffusion) with constant properties. The parallelization of the 2D heat equation is actually one of the problems on the final exam of Prof. Parigger's class and is based on the paper by Horak and Gruber, Parallel Numerical Solution of 2-D Heat Equation.

Here's an outline of what I am planning to talk about:
  1. Introducing the Heat Equation
  2. Discretization of the Heat Equation
  3. Stability Analysis and the Courant-Friedrichs-Lewy Criterion
  4. Programming the Heat Equation in C using Visual Studio
  5. MPICH2 and Visual Studio
  6. The Essential Ingredients of an MPI Program
  7. MPI Warmup
  8. Parallelizing the Heat Equation (2 or 3 posts - most likely a video tutorial)
The heat equation is a great example to introduce parallel computing because we will use an explicit scheme for the discretization. The reason for the simplicity of the explicit scheme and its suitability for parallel computing will become clear as I progress in this series. (For those who absolutely want to know why it is simpler to parallelize an explicit finite difference scheme, I'll briefly point out that it is because you are not solving a linear system of equations. When using explicit methods, the values at the new time step are explicitly expressed in terms of the values at the previous one, and therefore the exchange of values at the interface is straightforward.)

So let us first start by introducing the heat equation.

The heat equation (with no heat sources) expresses a balance between the accumulation and the flow of heat within a certain region of space, usually referred to as a control volume. Obviously, the heat equation is not restricted to heat. Many other physical phenomena share the same form of the heat equation, such as mass transfer, momentum transfer, and Schrödinger's equation. In its simplest form, the heat equation describing heat diffusion in solids (aka conduction heat transfer) is given by

ρc ∂T/∂t = k ( ∂²T/∂x² + ∂²T/∂y² )    (1)

ρ is the density of the material and c is the heat capacity (if we had a fluid, c would be the heat capacity at constant pressure, or cp). Together they describe how much heat a solid (or a fluid element) can store, and subsequently how much its temperature rises, as opposed to letting the heat pass through. k is the thermal conductivity of the solid and describes how easily heat flows through it: a higher k means more heat can flow through the material. For example, insulators have a very low thermal conductivity while heat sinks (such as CPU heat sinks) have a very large k.

The unsteady term at the LHS of Eq. (1) describes how much heat is accumulated in a control volume as time goes by. On the other hand, the RHS describes how much heat is flowing in the spatial directions.

Fundamentally, the heat equation describes a process called diffusion, which denotes the transfer of "information" (e.g. heat) at the molecular level. This distinction becomes useful when convection enters the picture, because convection occurs at the macroscopic level, in the presence of a velocity field associated with fluid flow (I'll explain this more in a future post).

Mathematically, diffusion is almost always described by second derivatives in space (while convection is associated with first derivatives in space). This is clearly visible in Eq. (1) above. A more convenient way of writing the heat equation is by dividing both sides of Eq. (1) by ρc. The outcome is

∂T/∂t = α ( ∂²T/∂x² + ∂²T/∂y² ),  where α = k/(ρc)    (2)

The new parameter, α, shown in Eq. (2) is called the thermal diffusivity. It is a measure of how readily a solid passes heat rather than storing it. For example, large values of α mean that heat can flow through the material very fast, while small values indicate a tendency for the material to store heat.

In the next post, we will discretize Eq. (2) using explicit forward differencing in time, and central differencing in space.

Table of Contents:

  1. Introducing the Heat Equation
  2. Discretization of the Heat Equation
  3. Stability Analysis and the Courant-Friedrichs-Lewy Criterion
  4. Programming the Heat Equation in C using Visual Studio
  5. MPICH2 and Visual Studio
  6. Parallel Computing in a Nutshell
  7. Measuring Parallel Performance
  8. Essential Ingredients of Message Passing
  9. MPI Warmup
  10. Parallelizing the Heat Equation

Cite as:
Saad, T. "Parallel Computing with MPI - Part I: The Heat Equation". Weblog entry from Please Make A Note. http://pleasemakeanote.blogspot.com/2008/06/parallel-computing-with-mpi-part-i-heat.html

Best Programming Font

Programming is one of the most noble ways of "communicating" with your computer, not to say "ordering" the machine to labor for you! Of course, after removing the mouse, the keyboard, and that cheap uncalibrated monitor from the picture, programming is the closest one could get to interacting with, and potentially messing things up in, a computer. I find pleasure in programming, as it requires style, finesse, management, and an artistic vision.

Today, most programmers use IDEs (Integrated Development Environments) to write, manage, and compile their code. The features provided by an IDE are countless. Most of them now include syntax coloring and, most importantly, code completion, thus making it easy for programmers to focus on their objectives rather than deal with unnecessary tasks such as finding a typo somewhere in the code.

I write my programs on Windows and occasionally on Linux. I've had a good experience programming on the Mac using Objective-C as well. I was about to make a full transition to Unix platforms, but with the introduction of the .NET framework and the new philosophy of the Microsoft people regarding programming, I decided to stick with Windows, and I program mostly in C# now using Visual Studio. Microsoft made a colossal improvement in its methodologies, and the development tools they now provide are excellent. With the new Visual Studio series (Express and Professional editions), I no longer have to worry about the portability of my C programs. Note that the Visual Studio 2008 Express editions can be downloaded for free! Thanks Microsoft! Also, the Visual Studio 2008 Professional edition is available for free, provided you are affiliated with a university. Check the DreamSpark initiative by Microsoft.

Anyway, the point of this post is the font used in the source code editor. Most (i.e. all) programmers prefer to use monospaced fonts. The characters in these fonts all occupy the same width in the editor, and therefore all formatting, spaces, tabs, and the like will align perfectly and nicely in your code. The default font used in Visual Studio is Courier New. I've been using it since forever. But I got bored with it. After seeing the excellent default fonts in KDevelop on Linux and in Xcode, I started looking for a font that would substitute for Courier New... and I found it last week! It was already there, on my system, but for some reason I had not tried it in Visual Studio.

The font is Consolas. It is a monospaced font and I simply find it to be excellent for programming. You can download it here.

In Visual Studio, to change the default font:
Go to Tools/Options/Fonts and Colors (located under the Environment node to your left)
Voila!

Here's how the Courier New font looks
while the Consolas font looks like this
By the way, the font size that I use is 10.

Cite as:
Saad, T. "Best Programming Font". Weblog entry from Please Make A Note. http://pleasemakeanote.blogspot.com/2008/06/best-programming-font.html

Friday, June 13, 2008

Posting Source Code in Blogger

Finally! I figured that out! you know what I mean, right? Yes... of course you do.

Well, that previous post almost drove me nuts! I clicked on the publish button, quickly previewed the post, and then went to bed.

The next day (i.e. yesterday), I was happily checking my post, only to see that all the parts that contain code were a mess!

The culprit? It turned out to be the "less than" or "<" symbol. Obviously, the Google WYSIWYG editor, although very smart and easy to use, does not automatically convert the "less than" (and other) symbols into their html equivalents when they are used within a blockquote, for example. Why do we need to convert in the first place? Because html tags start with a "less than" symbol, so the html code was treating the "<" symbol as the opening of an html tag. The first solution is to manually enter the html code for "<" in the html editor of the post. The html equivalent of "<" is &lt;.

The second solution is to wait and hope for the Blogger team to incorporate some features for programmers in the WYSIWYG editor. (thanks Blogger team, by the way).

Finally, after figuring that out, I also wanted to add some syntax highlighting into my code. I found this nice syntax highlighter on google code.

To make this work on blogger, do the following

  1. Download SyntaxHighlighter here
  2. Unzip and upload the following files to your webspace (googlepages is a great place)
    1. SyntaxHighlighter.css
    2. shCore.js
    3. shBrushCpp.js (or whatever code you wish to use on your blog)
  3. Go to your Dashboard/Layout/Edit HTML
  4. Add the following code right after the tag
    <!-- end outer-wrapper -- >

    <link href='YOUR_GOOGLE_PAGES/SyntaxHighlighter.css' rel='stylesheet'
    type='text/css'/>
    <script language='javascript' src='YOUR_GOOGLE_PAGES/shCore.js'/>
    <script language='javascript' src='YOUR_GOOGLE_PAGES/shBrushCpp.js'/>
    <script language='javascript'>
    dp.SyntaxHighlighter.BloggerMode();
    dp.SyntaxHighlighter.HighlightAll('code');
    </script>

Now, to add source code to a post, it should be placed in a <pre> tag. Obviously, you have to do this within the "Edit Html" tab of the Google editor. The typical format that I use is
<pre name="code" class="Cpp">
...insert code here...
</pre>

voila!

Cite as:
Saad, T. "Posting Source Code in Blogger". Weblog entry from Please Make A Note. http://pleasemakeanote.blogspot.com/2008/06/posting-source-code-in-blogger.html

Wednesday, June 11, 2008

2D Arrays in C Using malloc

Pointers can be easily used to create a 2D array in C using malloc. The idea is to first create a one dimensional array of pointers, and then, for each array entry, create another one dimensional array. Here's a sample code:
double** theArray;
theArray = (double**) malloc(arraySizeX*sizeof(double*));
for (int i = 0; i < arraySizeX; i++)
    theArray[i] = (double*) malloc(arraySizeY*sizeof(double));
Voila!

What I usually do is create a function called Make2DDoubleArray that returns a (double**) and then use it in my code to declare 2D arrays here and there

double** Make2DDoubleArray(int arraySizeX, int arraySizeY) {
    double** theArray;
    theArray = (double**) malloc(arraySizeX*sizeof(double*));
    for (int i = 0; i < arraySizeX; i++)
        theArray[i] = (double*) malloc(arraySizeY*sizeof(double));
    return theArray;
}
Then, inside the code, I would use something like
double** myArray = Make2DDoubleArray(nx, ny);
Voila!

Of course, do not forget to remove your arrays from memory once you're done using them. To do this

for (int i = 0; i < nx; i++) {
    free(myArray[i]);
}
free(myArray);
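For symmetry with Make2DDoubleArray, the cleanup can also be wrapped in a small function, something like:

void Free2DDoubleArray(double** theArray, int arraySizeX) {
    for (int i = 0; i < arraySizeX; i++)
        free(theArray[i]);    /* free each row first             */
    free(theArray);           /* then free the array of pointers */
}

and then simply call Free2DDoubleArray(myArray, nx); when you are done.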


UPDATE: JULY 10, 2008
As was noted by an anonymous comment, a great advantage of declaring a 2D array in this manner is the ability to use it as myArray[i][j], a very intuitive form.

Cite as:
Saad, T. "2D Arrays in C Using malloc". Weblog entry from Please Make A Note. http://pleasemakeanote.blogspot.com/2008/06/2d-arrays-in-c-using-malloc.html

Tuesday, June 10, 2008

Multiple Instances of Powerpoint

Sometimes I find it very useful to run multiple instances of PowerPoint, i.e. have a different presentation in a different PowerPoint window, so that I can compare the content of two presentations. This is especially useful when you have multiple monitors, so that you can have a different PowerPoint presentation on each monitor. Unfortunately, PowerPoint does not allow multiple instances to run; however, there is a neat trick to achieve this.
  1. Create a new user on your machine
  2. Locate the Powerpoint icon in your start menu (or search for Powerpoint in your start menu)
  3. Press "Shift" and right click on the icon
  4. select "Run as"
  5. Enter the credentials of the newly created user
Voila!

Now you have a new Powerpoint instance that you can place on a second monitor.

Alternatively, for the "programmatically" inclined users, you can invoke the "Run as" command through the command shell
  1. Open a new command shell (Start menu/ search for "cmd")
  2. Type the following
    runas /user:username "C:\Program Files\Microsoft Office\Office12\POWERPNT.EXE"
  3. Press enter
Voila!

You will be asked to type in the username password, of course. Also, don't forget to replace username with the newly created user account.

Finally, the best way is to put the above command in a batch file that you can place on your desktop. To do this, simply open up a new notepad document, paste the above command, and save the file with the ".bat" extension (without quotes). All you need to do from now on is simply double click that batch file to run a new instance of Powerpoint.

Cite as:
Saad, T. "Multiple Instances of Powerpoint". Weblog entry from Please Make A Note. http://pleasemakeanote.blogspot.com/2008/06/multiple-instances-of-powerpoint.html

Monday, June 9, 2008

Compute the Number of Operations in an Algorithm

To compute the number of floating point operations in a numerical algorithm, first write down your algorithm as pseudocode. Then simply count the number of arithmetic operations that the algorithm performs (per iteration, if the algorithm loops). All operations (addition, subtraction, multiplication, and division) are usually counted the same, i.e. 1 flop each. This is not exactly true, since multiplication involves several additions and division involves several multiplications when actually executed by a computer. However, we are looking for an estimate here, so it is reasonable to assume that, on average, all operations count in the same manner.

Here is an example (just for illustration):

for i = 0 to P
    for n = 1 to N (number of elements in array)
        B(n) = a(n)*a(n-1) - 2*c(n) + 3
    next n
next i
For each n there are 2 multiplications, 1 subtraction, and 1 addition, resulting in 4 operations. This loop is executed N times, so there are 4N operations. This gives the order of the algorithm: in this example, it is O(4N), or simply O(N) (constants do not count).

For all iterations, there are 4N(P+1) operations. (remember, when starting the loop from 0 to P, there are P+1 steps).

I posted this article in 2005 in the FAQ section of the CFD-Wiki. You'll find the original post here.

Cite as:
Saad, T. "Compute the Number of Operations in an Algorithm". Weblog entry from Please Make A Note. http://pleasemakeanote.blogspot.com/2008/06/compute-number-of-operations-in.html

Sunday, June 8, 2008

CPU Time Used in a Mathematica Session

To compute the CPU time used to perform computations/evaluations in Mathematica, simply invoke the following command
TimeUsed[]
Voila!

You can insert it at any location in your Mathematica notebook. It will give you the CPU time in seconds.

Cite as:
Saad, T. "CPU Time Used in a Mathematica Session". Weblog entry from Please Make A Note. http://pleasemakeanote.blogspot.com/2008/06/cpu-time-used-in-mathematica-session.html

Thursday, June 5, 2008

Polar to Cartesian Transformation in Fluent UDF

Download udf c file

When simulating a certain problem in CFD, the closer your initial values are to the exact solution, the faster your numerical solution will converge. Now of course, if we had the exact solution, who cares about CFD? That is true, but most of the time we have simplified models that describe the flowfield in a certain geometry. Therefore, it makes sense to initialize the numerical data with the quasi-exact analytical model.

On the other hand, there is a technique that I use to make high quality streamline animations for my analytical models. (I will explain this in a subsequent article). The bottom line is that I need to initialize my mesh with the analytical solution. Now this solution is most often in cylindrical/polar coordinates and Fluent does not offer that luxury (and I agree, because it is easier to write a general code for the Cartesian form of the fluid flow equations). Fluent reads the mesh and solves the problem using Cartesian coordinates and all cells are defined using (x,y,z), while all velocities refer to the x,y, and z components respectively.

It is very easy to initialize the Fluent model with your analytical solution when it is in Cartesian coordinates. For polar/cylindrical models, a little bit of extra programming is required.

I have written the code to do this for 2D polar coordinates. The 3D version will follow in the next article, as generalization is required (to account for axis and origin locations - in case your axis is in the z direction and located at (0,0), you can simply extend the attached code by adding the z component of the velocity).
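For reference, the core of the transformation is just the usual change of coordinates. Here is a bare-bones sketch of that math in plain C (the function and variable names are illustrative only; in the attached UDF, the same formulas sit inside Fluent's cell loop):

#include <math.h>

/* Convert a polar velocity (ur, utheta) at the point (x, y), measured from
   the origin (x0, y0), into Cartesian components (u, v). */
void polarToCartesianVelocity(double x, double y, double x0, double y0,
                              double ur, double utheta,
                              double* u, double* v)
{
    double theta = atan2(y - y0, x - x0);        /* polar angle of the point */
    *u = ur * cos(theta) - utheta * sin(theta);
    *v = ur * sin(theta) + utheta * cos(theta);
}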

Please take a moment to look at the file. Everything is commented clearly and the locations where you need to make changes are capitalized. You basically have to modify the origin location and the analytical solution.

(I assume you have knowledge of how to hook a udf file to fluent. I will prepare an article on that in the future).

Download udf c file

Cite as:
Saad, T. "Polar to Cartesian Transformation in Fluent UDF". Weblog entry from Please Make A Note. http://pleasemakeanote.blogspot.com/2008/06/polar-to-cartesian-transformation-in.html

Monday, June 2, 2008

Computing Derivatives in Fluent UDFs

To compute the derivatives at cell centers in a user defined function, one can make use of the several macros provided by Fluent, such as C_DUDY(c,t). However, this will not work for the inviscid solver. It will only work when viscous effects are included (laminar or turbulence models). I will expand on this in the articles to follow.

Cite as:
Saad, T. "Computing Derivatives in Fluent UDFs". Weblog entry from Please Make A Note. http://pleasemakeanote.blogspot.com/2008/06/computing-derivates-in-fluent-udfs.html