Page 1 of 1

What can be done to speed this up?

Posted: Sun Feb 21, 2010 5:14 pm UTC
by Aldarion
For a couple of months I've been playing with a simulation of a simple variant of solitaire.
The game goes like this:
Shuffle the deck, put four cards aside.
Arrange the remaining 48 cards face up in a 4x12 grid, assign to each row a suit and to each column a value from 2 to K (no aces), like this:

2S 3S 4S ... JS QS KS
2D 3D 4D ... JD QD KD
2C 3C 4C ... JC QC KC
2H 3H 4H ... JH QH KH

Now take the first of those remaining four cards. Note its value. If it's an ace, just put it on the right of the appropriate king. If not, put it at its right place in the grid, take the (wrong) card that was there and continue doing the same. Eventually you will arrive at an ace, which ends the "chain"... but you have three more cards set aside!
If at the end the whole deck is perfectly ordered, you've won. Yes, you win even if you don't have to actually touch every single card: all that counts is the order.

Anyway, I've written a simple enough simulation that just plays the game on a user-specified number of shuffled decks and counts the wins. Here's the code:

Code: Select all

#include <iostream>
#include <cstdlib>
#include <time.h>
#include <float.h>

int main(int argc, char* argv[]) {
   //Define the deck as an array of 52 shorts.
   short deck[52]={0};
   short columns[53] = {0};
   short rows[53] = {0};
   int i;
   int iter;
   int tmp;
   int k;
   int n;
   short matrix[4][13]={0};
   short hand[4]={0};
   int row;
   int col;
   short currcard=0;
   div_t position;
   short tempcard=0;
   bool win;
   int bound=0;

   long int Counter = 0;

   //Prepare the rows and columns array:
   for(i=1;i<53;i++){

      position = div(i, 13);
      rows[i] = position.quot;
      columns[i]=position.rem-1;
      if (columns[i]==-1){
         columns[i]=12;
         rows[i]--;
      }

   }

   srand(time(NULL));
   std::cout<<"How many iterations? ";
   std::cin>>bound;

   for(iter=0;iter<bound;iter++){

      //Populate the deck.
      for(i=0;i<52;i++)
         deck[i]=i+1;
      //Reset the matrix.
      for(row=0;row<4;row++)
         for(col=0;col<13;col++)
            matrix[row][col]=0;
      
      //Shuffle the deck.

      n = 52;            // The number of items left to shuffle (loop invariant).
      while (n > 1)
      {
         n--;                         // n is now the last pertinent index
         k = rand()%(n+1);  // 0 <= k <= n.
         tmp = deck[k];
         deck[k] = deck[n];
         deck[n] = tmp;
      }


      //Populate the matrix with the first 48 cards.
      i = 0;

      for(row=0;row<4;row++)
         for(col=0;col<12;col++){
            matrix[row][col]=deck[i];
            i++;
         }

         //Take the last four cards into the hand.
         hand[0]=deck[48];
         hand[1]=deck[49];
         hand[2]=deck[50];
         hand[3]=deck[51];

         
         //Start main loop of the game.
         for(i=0;i<4;i++)
         {
            //Load the current card from the hand into currcard.
            currcard=hand[i];

            while (currcard>0)
            {

               //Get the card currently in that position.
               tempcard = matrix[rows[currcard]][columns[currcard]];

               //Put the current card in its rightful position.
               matrix[rows[currcard]][columns[currcard]] = currcard;

               currcard = tempcard;

            }
         }

         //Check whether the game was successful.
         win = true;

         for(i=1;i<53;i++){
            if (matrix[rows[i]][columns[i]]!=i){
               win = false;
               break;
            }
         }

         if (win)
            Counter++;

   }

   std::cout<<"Number of winning games: "<<Counter<<"/"<<iter<<". \n";
   float perc = ((double)Counter/iter)*100;
   std::cout<<"Percentage of wins: "<<perc<<"%. \n";
   system("PAUSE");

}


On my single-CPU 3.0 GHz P4 it crunches through 10,000,000 decks in about 36 seconds, giving 277,777 decks per second.
Can it be optimized further? I'm pretty sure it can, because I'm not that good a coder.

By the way, it appears that you win about 19.7% of the time. And if you count only "pure" wins, i.e. when you have to move every card by yourself, then you win only about 7% of the time.

Re: What can be done to speed this up?

Posted: Sun Feb 21, 2010 5:34 pm UTC
by fazzone
I'm not much of an optimization expert myself either. If you really really need the speed, I'd unroll some of your smaller loops. Past that, there's not much I can see except to hand code it in assembly (which wouldn't be impossibly hard). You could also probably use unsigned char instead of short if you only need 0-52. Or you could use shorts or ints and mask out the cards...but I'm not sure if that would be much of a performance gain.

P.S: in C++, it is considered good practice to postpone variable declaration as long as possible.
Edit: are you compiling with -O3 or whatever relevant flags your compiler uses? If you're on a Linux platform, I've observed that the Intel C++ compiler often generates faster code than GCC, so you may want to look into that at http://software.intel.com/en-us/articles/non-commercial-software-download/
Edit 2: Silly me, you must not be on Linux if you use system("PAUSE") (getchar() is often a more portable way of accomplishing the same task)

Re: What can be done to speed this up?

Posted: Sun Feb 21, 2010 5:45 pm UTC
by Aldarion
Yes, I'm on Windows... Visual Studio 2005, to be precise.

Now, the big question is: why 19.7% ? I think it's simple enough to permit some mathematical analysis... I know that in the case of a single suit the probability of a win is e-1 (as the number of cards goes to infinity), and there are proofs of that, but I found nothing on 2+ suits.

Re: What can be done to speed this up?

Posted: Sun Feb 21, 2010 7:09 pm UTC
by fazzone
Well to be rigorous and be much more sure of correct results, you would have to use a serious random number generator (such as the Mersenne Twister) instead of the Linear Congruential Method (LCM) based approach that your libc probably uses (I know it's what glibc uses).

Re: What can be done to speed this up?

Posted: Sun Feb 21, 2010 7:27 pm UTC
by lulzfish
Yes, the default rand() in libc sucks. A lot.

Re: What can be done to speed this up?

Posted: Sun Feb 21, 2010 8:51 pm UTC
by operator[]
I simplified your code quite a lot, and hopefully it will run a bit faster too since the compiler can optimize it easier. Here you go:
Spoiler:

Code: Select all

#include <iostream>
#include <algorithm>
#include <cstdlib>
#include <ctime>
using namespace std;

int main(int argc, char** argv){
   typedef unsigned char card_t;
   typedef int count_t;
   //If you want to support more than ~2^32 tests, change count_t to some >=64-bit type, like long long or __int64.
   card_t deck[52];
   card_t hand[4];
   card_t invalid=(card_t)-1;
   count_t bound=0;
   count_t counter=0;
   
   srand(time(NULL));
   cout << "How many iterations? ";
   cin >> bound;
   
   for(int i=0;i<52;i++){
      deck[i]=i;
   }
   
   for(count_t iter=0;iter<bound;++iter){
      //Shuffle the deck
      //Pass a function as the last parameter to random_shuffle if you want to use something else than the standard
      // rand(). The function should take one integer parameter N and return a random integer in the range [0,N).
      random_shuffle(deck, deck+52);

      //Pick up the first four cards
      for(int i=0;i<4;++i){
         hand[i]=deck[i];
         deck[i]=invalid;
      }
      
      //Start main loop of the game
      for(int i=0;i<4;++i){
         //Load the current card from the hand into curr
         card_t curr=hand[i];
         while(curr!=invalid){
            swap(curr,deck[curr]);
         }
      }
      
      //Check whether the game was successful
      for(int i=4;i<52;++i){
         if(deck[i]!=i){
            goto next;
         }
      }
      
      ++counter;
      next:;
   }
   
   cout << "Number of winning games: " << counter << "/" << bound << ". \n";
   double perc = 100.0*counter/bound;
   cout << "Percentage of wins: " << perc << "%. \n";
   //system("pause");
   return 0;
}

Re: What can be done to speed this up?

Posted: Mon Feb 22, 2010 5:37 am UTC
by phlip
It may be worth noting that you don't need to actually rearrange the cards as you're playing... it's enough for you to jump from one card to the next until you reach an ace. As long as the total number of cards you see is 52 minus the number of cards originally dealt into the correct location, you win.

So, like, you draw KS, in the KS space is 2H, and in the 2H space is AD... that's three cards seen... don't bother moving anything, just add 3 to some counter, and draw the next card.

Because if you were moving the cards around, you would've moved KS, 2H and AD, none of which were in the right positions, into the right positions... an increase in the number of correct cards by 3. And you will never touch those cards, or those spaces, again while you're playing, so it doesn't matter if you store the movement or not.

So the main loop can be more like:

Code: Select all

random_shuffle(deck, deck+52);

count = 0;
for (int i = 4; i < 52; i++) // remove these three lines to play the "you must touch every card" variant
  if (deck[i] == i)
    count++;
for (int i = 0; i < 4; i++)
{
  count++;
  int tmp = i;
  while (deck[tmp] >= 4)
  {
    count++;
    tmp = deck[tmp];
  }
}
if (count >= 52)
  count_wins++;
(Not tested in any way, treat as pseudocode)

Re: What can be done to speed this up?

Posted: Mon Feb 22, 2010 10:11 am UTC
by Aldarion
Thanks a lot, guys!

operator[], your code does run faster than mine: down from 39s to 36s for 10,000,000 decks.
I was just stuck with the idea of accurately modeling the game, that I didn't even think of getting rid of the matrix.

philip, your code runs faster still: 34s for ten megadecks. When I saw operator[]'s linear solution I had a thought of a noting (instead of actually rearranging) approach, but your implementation is more elegant than what I had in mind.

Now, about the PRNG, I don't think it would change the results in any way... I mean, even the simplest shuffle can turn a winning combination into a losing one and vice versa, and anyway it doesn't seem that some bias in the numbers should favour winning or losing decks*. What a different PRNg will probably do, though, is slow down the simulation...

*By the way, during an earlier attempt to optimize the simulation I made a most interesting bug: the following code seems to generate only winning decks:

Code: Select all

#include <iostream>
#include <cstdlib>
#include <time.h>
#include <float.h>

int main(int argc, char* argv[]) {
   //Define the deck as an array of 52 shorts.
   short deck[52]={0};
   short columns[53] = {0};
   short rows[53] = {0};
   int i;
   int iter;
   int tmp;
   int k;
   int n;
   short matrix[4][13]={0};
   short hand[4]={0};
   int row;
   int col;
   short currcard=0;
   div_t position;
   short tempcard=0;
   bool win;
   int bound=0;

   long int Counter = 0;

   //Prepare the rows and columns array:
   for(i=1;i<53;i++){

      position = div(i, 13);
      rows[i] = position.quot;
      columns[i]=position.rem-1;
      if (columns[i]==-1){
         columns[i]=12;
         rows[i]--;
      }

   }

   srand(time(NULL));
   std::cout<<"How many iterations? ";
   std::cin>>bound;
   
   for(iter=0;iter<bound;iter++){
      

      //Populate the matrix:
      for(i=1;i<53;i++)
         matrix[rows[i]][columns[i]] = i;


      //Shuffle the deck.

      n = 52;            // The number of items left to shuffle (loop invariant).
      while (n > 1)
      {
         n--;                         // n is now the last pertinent index
         k = rand()%(n+1);  // 0 <= k <= n.
         tmp = matrix[rows[k]][columns[k]];
         matrix[rows[k]][columns[k]] = matrix[rows[n+1]][columns[n+1]];
         matrix[rows[n+1]][columns[n+1]] = tmp;
      }


      //Populate the matrix with the first 48 cards.
   
      //Take the last four cards into the hand.
      hand[0]=matrix[0][12];
      hand[1]=matrix[1][12];
      hand[2]=matrix[2][12];
      hand[3]=matrix[3][12];

      matrix[0][12] = 0;
      matrix[1][12] = 0;
      matrix[2][12] = 0;
      matrix[3][12] = 0;

      
      //Start main loop of the game.
      for(i=0;i<4;i++)
      {
         //Load the current card from the hand into currcard.
         currcard=hand[i];

         while (currcard>0)
         {
            //Get the card currently in that position.
            tempcard = matrix[rows[currcard]][columns[currcard]];

            //Put the current card in its rightful position.
            matrix[rows[currcard]][columns[currcard]] = currcard;

            //Set the current card to be the picked up card.
            currcard = tempcard;            

         }
      }


      //Check whether the game was successful.
      win = true;
      i=1;
      for(row=0;row<4;row++)
         for(col=0;col<13;col++)
         {
            if (matrix[row][col]!=i)
               win = false;
            i++;
         }

         if (win)
            Counter++;

   }

   std::cout<<"Number of winning games: "<<Counter<<"/"<<iter<<". \n";
   float perc = ((double)Counter/iter)*100;
   std::cout<<"Percentage of wins: "<<perc<<"%. \n";
   system("PAUSE");

}

Re: What can be done to speed this up?

Posted: Mon Feb 22, 2010 12:43 pm UTC
by 0rm
Dunno if it makes as much of a difference, but would using a 1D array be faster than using a 2D array in your matrix array?

Re: What can be done to speed this up?

Posted: Mon Feb 22, 2010 1:36 pm UTC
by Aldarion
Yes, and operator[] has already done so.