How to Learn C#


1 .Net C# Data Types

Data is physically stored in cells of memory. That memory can be persistent storage (such as a hard disk) or main memory (RAM). Every memory cell is identified by a unique address, which is nothing more than a combination of numbers or symbols.

The C# language provides practically all the data types a programmer needs. These types fall into three categories: value types, reference types, and pointer types (the last are available only in unsafe code).

A few more basic concepts need to be covered before discussing data types: variables and constants. A variable is a named cell of memory used for data storage, and its value can be changed at any time. Every variable must have a type, and this type must be set before the variable is used. Giving a variable a name and a type is called declaring the variable. The type is the most important aspect of a variable because it defines how the variable behaves. Variables can be divided into seven categories depending on the context in which they are used:

  1. Static variables
  2. Instance variables
  3. Array elements
  4. Parameters passed by reference
  5. Parameters passed by value
  6. Output parameters
  7. Local variables

A static variable stays alive for the whole life of the program. It is declared with the static modifier.

An instance variable is a variable declared without the static modifier inside a class or structure definition. Each object gets its own copy.

Value parameters can be viewed as inputs to a method:

public static void Sume(int a, int b)
{
    Console.WriteLine("The sum of elements {0} and {1} is {2}", a, b, a + b);
}

This code writes the values of a and b and their sum to the console. If the values of these parameters are modified inside the method, the change is not visible to the caller, because value parameters receive copies of the arguments. This ensures that the caller's original variables are not modified.

To let a method modify the caller's variables, C# provides reference parameters, which are declared by placing the ref modifier before the parameter type. (A related modifier, out, is described below.) Have a look at the following example:

public static void SumeRef(ref int a, ref int b)
{
    // These assignments change the caller's variables.
    a = 4;
    b = 6;
    Console.WriteLine("The sum of elements {0} and {1} is {2}", a, b, a + b);
}

This method sets the variables passed as a and b to 4 and 6. These values are retained after the method returns.
If a method needs to hand a result back to the caller, the result can be passed through a parameter marked with the out modifier or returned as the method's return value. Here is an example that does both and delivers the same value each way:

public static int SumeOut(int a, int b, out int sume)
{
    // An out parameter must be assigned before the method returns.
    sume = a + b;
    Console.WriteLine("The sum of elements {0} and {1} is {2}", a, b, a + b);
    return sume;
}

In the Main function it is called as follows:

int sume;
sume = SumeOut(2, 2, out sume);
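The three ways of passing parameters described above can be compared side by side. The following is only a minimal sketch; the class ParameterDemo and its method names are made up for this illustration and are not part of the tutorial's sample code.

```csharp
using System;

public static class ParameterDemo
{
    // Value parameter: the method works on a copy of the argument.
    public static void IncrementByValue(int n)
    {
        n = n + 1;
    }

    // Reference parameter: the method works on the caller's variable.
    public static void IncrementByRef(ref int n)
    {
        n = n + 1;
    }

    // Output parameters: each must be assigned before the method returns.
    public static void Divide(int a, int b, out int quotient, out int remainder)
    {
        quotient = a / b;
        remainder = a % b;
    }

    public static void Main()
    {
        int x = 5;
        IncrementByValue(x);
        Console.WriteLine(x);   // 5, only the copy was changed

        IncrementByRef(ref x);
        Console.WriteLine(x);   // 6, the caller's variable was changed

        int q, r;               // out arguments need no initial value
        Divide(17, 5, out q, out r);
        Console.WriteLine("{0} remainder {1}", q, r);   // 3 remainder 2
    }
}
```

Note that ref and out must be written both in the method declaration and at the call site, which makes it visible to the reader that the call may change the variable.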

Constants in C#:

A constant is a value that cannot be changed after it is declared. Constants are declared with the const keyword, for example: const double PI = 3.1415;

Value types in C#:

A value type variable holds the data itself. When the value is assigned to another variable or passed to a method by value, a copy of the data is made. This guarantees that changes made to the copy in one place do not affect the original. Let us look at a simple example:

public class IntClass
{
    public int I = 1;
}

Here we have a simple class that contains only one public data field of integer type. Now have a look at how it is used in the Main function:

static void Main(string[] args)
{
    // value types: j receives a copy of i
    int i = 10;
    int j = i;
    j = 11;

    // reference types: ic2 refers to the same object as ic1
    IntClass ic1 = new IntClass();
    IntClass ic2 = ic1;
    ic2.I = 100;

    Console.WriteLine("value of i is {0} and j is {1}", i, j);
    Console.WriteLine();
    Console.WriteLine("value of ic1.I is {0} and ic2.I is {1}", ic1.I, ic2.I);
    Console.WriteLine();
}

Reference Types in C#:

In the example above there are two value type variables, i and j. The second variable is initialized with the value of the first one, which creates a new copy in memory. After j is assigned 11, the values of the variables are:

i = 10
j = 11

The rest of the example illustrates reference types. First, the variable ic1 of type IntClass is created with new, which allocates the object dynamically on the heap. Then ic2 is initialized with the value of ic1. Both variables now refer to the same object, so changing ic2.I to 100 also changes ic1.I, and the program prints 100 for both.

Now for the most important value types in C#. The simple types category contains predefined system types that are also common in other programming languages. It includes the integer types: sbyte, byte, short, ushort, int, uint, long and ulong. These types differ only in their range of values and in whether they are signed.

The next simple type is the character type, declared with the keyword char. A char holds a single 16-bit Unicode character.

The Boolean type, bool, has only two values: true and false. Unlike in C++, the integers 0 and 1 cannot be assigned to it.

The next category of simple types is the floating point types. It contains two types, float and double. The float type covers a range of approximately 1.5 × 10^-45 to 3.4 × 10^38, and the double type covers approximately 5.0 × 10^-324 to 1.7 × 10^308.
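The ranges of the simple types do not have to be memorized, because each type exposes MinValue and MaxValue constants. The short program below is a sketch for experimenting with them; the class name SimpleTypesDemo is hypothetical.

```csharp
using System;

public static class SimpleTypesDemo
{
    // Returns the numeric Unicode code of a character.
    public static int CodeOf(char c)
    {
        return (int)c;
    }

    public static void Main()
    {
        // Signed and unsigned variants of the same size differ only in range.
        Console.WriteLine("sbyte: {0} to {1}", sbyte.MinValue, sbyte.MaxValue);   // -128 to 127
        Console.WriteLine("byte:  {0} to {1}", byte.MinValue, byte.MaxValue);     // 0 to 255
        Console.WriteLine("int:   {0} to {1}", int.MinValue, int.MaxValue);
        Console.WriteLine("uint:  {0} to {1}", uint.MinValue, uint.MaxValue);

        // A char is a 16-bit value.
        Console.WriteLine(sizeof(char) * 8);   // 16
        Console.WriteLine(CodeOf('A'));        // 65

        // A bool is assigned from true, false or a comparison, never from 0 or 1.
        bool isUpper = CodeOf('A') < CodeOf('a');
        Console.WriteLine(isUpper);            // True
        // bool b = 1;   // would not compile
    }
}
```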

The structural value types are struct and enum. A struct is similar to a class, but a struct variable holds the data itself rather than a reference to it. The following code snippet contains the definition of a struct:

struct Point3D
{
    public float m_x;
    public float m_y;
    public float m_z;

    // Returns the three coordinates as a new array.
    public float[] GetArray()
    {
        float[] arr = new float[3];
        arr[0] = m_x;
        arr[1] = m_y;
        arr[2] = m_z;
        return arr;
    }
}

The above declares a simple structure for a 3D point with float coordinates. A class declaration looks very similar; the main difference is that a class is a reference type, while a struct is a value type that is copied on assignment.
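The difference between a struct and a class shows up as soon as a variable is assigned to another one. The following sketch uses two hypothetical types, PointStruct and PointClass, that differ only in the struct/class keyword.

```csharp
using System;

public struct PointStruct
{
    public float X;
}

public class PointClass
{
    public float X;
}

public static class CopyDemo
{
    public static void Main()
    {
        PointStruct s1 = new PointStruct();
        s1.X = 1f;
        PointStruct s2 = s1;   // copies the data
        s2.X = 2f;

        PointClass c1 = new PointClass();
        c1.X = 1f;
        PointClass c2 = c1;    // copies only the reference
        c2.X = 2f;

        Console.WriteLine(s1.X);   // 1, s1 is unaffected by the change to s2
        Console.WriteLine(c1.X);   // 2, c1 and c2 are the same object
    }
}
```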

Enumerated types are used to create a set of named identifiers whose values are of a simple integer type. Let us look at an example of an enum type:

public enum Days
{
    Monday,
    Tuesday,
    Wednesday,
    Thursday,
    Friday,
    Saturday,
    Sunday
}

This example declares an enum that contains the names of the days of the week. By default the values of the days range from 0 to 6.
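The numbering of enum members can be checked with explicit casts. The sketch below uses two hypothetical enums, Level (default numbering) and HttpCode (explicit values), to show both cases.

```csharp
using System;

public enum Level { Low, Medium, High }            // numbered 0, 1, 2 by default
public enum HttpCode { Ok = 200, NotFound = 404 }  // explicitly assigned values

public static class EnumDemo
{
    public static void Main()
    {
        Level l = Level.Medium;
        Console.WriteLine(l);                        // Medium
        Console.WriteLine((int)l);                   // 1
        Console.WriteLine((Level)2);                 // High
        Console.WriteLine((int)HttpCode.NotFound);   // 404
    }
}
```

Converting between an enum and its underlying integer always requires an explicit cast, which prevents accidental mixing of unrelated enums.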

Common types in C#:

The object type is universal in C#: all supported types are derived from it. It defines only a handful of methods, among them GetType(), which returns the type of the object, and ToString(), which returns a string representation of the object.

The next type is the class. It is declared in the same manner as a structure, but it has more advanced features, such as inheritance.

An interface is an abstract type. It is used only to declare a type with abstract members, that is, members without implementations. Have a look at this declaration of a simple interface:

interface IRect
{
    int Width
    {
        get;
        set;
    }

    int Height
    {
        get;
        set;
    }

    int CalculateArea();
}

The members of an interface can be methods, properties, events and indexers.

The next reference type is the delegate. The main purpose of a delegate is to encapsulate a method so that it can be stored and passed around like a value. It is most similar to a pointer to a function in C++.
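A small example may make the idea of encapsulating a method more concrete. In the sketch below, BinaryOp is a hypothetical delegate type and Apply is a hypothetical helper that calls whatever method it is given.

```csharp
using System;

// A delegate type describes the signature of the methods it can refer to.
public delegate int BinaryOp(int a, int b);

public static class DelegateDemo
{
    public static int Add(int a, int b)
    {
        return a + b;
    }

    public static int Multiply(int a, int b)
    {
        return a * b;
    }

    // The method to call is passed in as a parameter.
    public static int Apply(BinaryOp op, int a, int b)
    {
        return op(a, b);
    }

    public static void Main()
    {
        BinaryOp op = new BinaryOp(Add);
        Console.WriteLine(Apply(op, 3, 4));   // 7

        op = new BinaryOp(Multiply);
        Console.WriteLine(Apply(op, 3, 4));   // 12
    }
}
```

Unlike a raw function pointer, a delegate is type-safe: the compiler rejects a method whose signature does not match the delegate type.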

The string is a common type found in almost all high-level programming languages. An example of a string declaration and initialization:

string s = "declaration and init";

The last widely used reference type is the array. An array is a set of elements that all have the same type. It can hold any number of elements, and that number is fixed when the array is created. C# has three kinds of arrays: one-dimensional, multidimensional (for example two-dimensional) and jagged arrays.
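The three kinds of arrays are declared with slightly different syntax. The following sketch (class name ArrayDemo is hypothetical) creates one of each and prints their sizes.

```csharp
using System;

public static class ArrayDemo
{
    public static void Main()
    {
        // One-dimensional array.
        int[] one = new int[] { 1, 2, 3 };

        // Two-dimensional (rectangular) array: every row has the same length.
        int[,] two = new int[2, 3];
        two[1, 2] = 42;

        // Jagged array: an array of arrays whose rows may differ in length.
        int[][] jagged = new int[2][];
        jagged[0] = new int[] { 1 };
        jagged[1] = new int[] { 1, 2, 3 };

        Console.WriteLine(one.Length);         // 3
        Console.WriteLine(two.Length);         // 6, total number of elements
        Console.WriteLine(two.GetLength(1));   // 3, size of the second dimension
        Console.WriteLine(jagged[0].Length);   // 1
        Console.WriteLine(jagged[1].Length);   // 3
    }
}
```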

So, this covers almost all the types used in C#. Values of these types can be converted (cast) to other types according to specific rules. An implicit conversion is allowed when the value can be converted without losing any data, for example from int to long. A special kind of implicit conversion is boxing, which converts a value of any value type to the object type. Boxing example:

// boxing
char ch = 'b';
object obj = ch;
Console.WriteLine("Char value is {0}", ch);
Console.WriteLine("Object value is {0}", obj);
Console.WriteLine();

This piece of code prints the same value for the char variable and for the object variable. The opposite of boxing is unboxing, which requires an explicit cast back to the original value type. An example of unboxing follows.

// unboxing
float q = 4.6f;
object ob = q;
Console.WriteLine("Object value is {0}", ob);
float r = (float)ob;
Console.WriteLine("Float value is {0}", r);
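One detail worth trying out is that unboxing only succeeds when the cast names the exact value type that was boxed. The sketch below, with a hypothetical helper CanUnboxAsInt, also contrasts an implicit numeric conversion with an explicit one.

```csharp
using System;

public static class ConversionDemo
{
    // Returns true when the boxed value can be unboxed as an int.
    public static bool CanUnboxAsInt(object boxed)
    {
        try
        {
            int value = (int)boxed;   // throws if boxed does not hold an int
            return value == (int)boxed;
        }
        catch (InvalidCastException)
        {
            return false;
        }
    }

    public static void Main()
    {
        int i = 123;
        long l = i;               // implicit: no data can be lost
        double d = 4.9;
        int truncated = (int)d;   // explicit: the fraction is dropped

        Console.WriteLine(l);           // 123
        Console.WriteLine(truncated);   // 4

        object boxedFloat = 4.6f;
        Console.WriteLine(CanUnboxAsInt(boxedFloat));   // False, a float was boxed
        Console.WriteLine(CanUnboxAsInt(i));            // True
    }
}
```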

That covers the main points of creating and using the common data types. To compile and run the sample, open a .NET command prompt and type: csc DataTypes.cs. This creates DataTypes.exe, which can be run like any other executable file.

2 .Net C# Tutorial Intermediate Language – MSIL

Microsoft Intermediate Language (MSIL) is a platform-independent instruction set. Every .NET compiler translates source code into MSIL, and the runtime then compiles that MSIL into machine code for the target machine. This means that code written in any supported language ends up in the same intermediate form before it is converted to the required machine code.

To get a feel for this language we need some code. This tutorial uses a very simple piece of C# source code. First compile it with the csc command on the .NET command line by typing: csc ILSample.cs. The compiler creates an executable file named ILSample.exe.

Next, run the command ildasm. It opens an application called the Intermediate Language Disassembler. Choose Open from the File menu and select the executable file. The application displays the assembly with all its structural units: classes, methods and data fields. Clicking a method shows its code in intermediate language. This code resembles assembly language, but its instruction set is independent of any particular processor. The C# method examined here is:


public double GetVolume()
{
    double volume = height * width * thickness;
    if (volume < 0)
        return 0;
    return volume;
}


ILDasm shows the following Microsoft Intermediate Language code for this method:


.method public hidebysig instance float64
GetVolume() cil managed
{
// Code size 51 (0x33)
.maxstack 2
.locals init ([0] float64 volume,
[1] float64 CS$00000003$00000000)
IL_0000: ldarg.0
IL_0001: ldfld float64 OOP.Aperture::height
IL_0006: ldarg.0
IL_0007: ldfld float64 OOP.Aperture::width
IL_000c: mul
IL_000d: ldarg.0
IL_000e: ldfld float64 OOP.Aperture::thickness
IL_0013: mul
IL_0014: stloc.0
IL_0015: ldloc.0
IL_0016: ldc.r8 0.0
IL_001f: bge.un.s IL_002d
IL_0021: ldc.r8 0.0
IL_002a: stloc.1
IL_002b: br.s IL_0031
IL_002d: ldloc.0
IL_002e: stloc.1
IL_002f: br.s IL_0031
IL_0031: ldloc.1
IL_0032: ret
} // end of method Aperture::GetVolume


To see that IL really is an intermediate language, write the same code in VB .NET and compare the IL produced from the two sources. The methods will be almost identical.

The main advantages of IL are:

  1. Language independence: IL does not depend on any source language, so an application can be built from modules written in different .NET-compatible languages.
  2. Platform independence: IL can be compiled for different platforms or operating systems.

3. OOP & C#

The backbone of object-oriented programming is, of course, the concept of the class. This C# tutorial on OOP explains classes and their importance in implementing object-oriented principles.

A language can be called object-oriented if it keeps data and the methods that use that data together in units called objects. The object-oriented approach has many advantages, among them flexibility and code reusability.

Every programming language that supports object-oriented programming supports these three main concepts:

  1. Encapsulation
  2. Inheritance
  3. Polymorphism

Encapsulation in C#:

Encapsulation is the practice of keeping data and the methods that operate on it together inside objects, and of exposing only a defined set of methods through which other code interacts with the object. In C#, encapsulation is realized through classes. A class can contain data fields and methods. Consider the following class.


public class Aperture
{
    public Aperture()
    {
    }

    protected double height;
    protected double width;
    protected double thickness;

    public double GetVolume()
    {
        double volume = height * width * thickness;
        if (volume < 0)
            return 0;
        return volume;
    }
}


In this example we encapsulate the data fields height, width and thickness together with the method GetVolume. Other methods or objects can interact with an Aperture object only through the members that have the public access modifier. Members are accessed with the "." operator.

Inheritance in C#:

In a few words, inheritance is the creation of new classes from already existing classes. Inheritance allows us to reuse parts of the code: the derived class inherits the members of the base class. Consider the following code snippet:


public class Door : Aperture
{
    public Door() : base()
    {
    }

    public bool isOutside = true;
}


As you can see, to derive one class from another we write the base class name after the ":" symbol. The Door() constructor also calls the base class constructor through base(). Finally, we add a new public field, isOutside. All members of the Aperture class are also part of the Door class, and the derived class can access those declared protected or public; private members of the base class are not accessible from it.

Polymorphism in C#:

Polymorphism is the ability of a call to behave differently depending on the actual type of the object it is made on. In C#, polymorphism is realized through the keywords virtual and override. Let us look at an example:


// This method is defined in the Aperture class.
public virtual void Out()
{
    Console.WriteLine("Aperture virtual method called");
}

// This method is defined in the Door class.
public override void Out()
{
    Console.WriteLine("Door virtual method called");
}


The Door class re-defines the method with the override keyword. The benefit of virtual methods becomes clear when an instance of the derived class is used through a variable of the base class type:

Aperture ap = new Door();
ap.Out();

In such cases the runtime keeps the details of all virtual methods in a table called the VMT (Virtual Method Table) and picks the correct version of the method dynamically at run time. Here it calls the Out() method of the derived class Door.
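The same mechanism can be seen in a complete, compilable program. The sketch below deliberately uses hypothetical classes (Shape, Circle, Square) rather than the tutorial's Aperture and Door, so that it stands on its own.

```csharp
using System;

public class Shape
{
    public virtual string Describe()
    {
        return "generic shape";
    }
}

public class Circle : Shape
{
    public override string Describe()
    {
        return "circle";
    }
}

public class Square : Shape
{
    public override string Describe()
    {
        return "square";
    }
}

public static class PolymorphismDemo
{
    public static void Main()
    {
        // Every element is declared as Shape, but the override that runs
        // is chosen by the actual type of each object.
        Shape[] shapes = new Shape[] { new Shape(), new Circle(), new Square() };
        foreach (Shape s in shapes)
        {
            Console.WriteLine(s.Describe());
        }
        // prints: generic shape, circle, square (one per line)
    }
}
```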

To compile the example, open a .NET command prompt and run: csc filename.cs

4 .Net C# Tutorial Namespaces

A namespace in Microsoft .NET is like a container of types. It may contain classes, structures, interfaces, enumerations and delegates, as well as other namespaces. The main goal of namespaces is a hierarchical organization of the program, so that a developer does not need to worry about naming conflicts between types in different parts of a project.

Every program has a default namespace, called the global namespace. The program itself can declare any number of namespaces, each with a unique name. The advantage is that every namespace can contain any number of types and nested namespaces whose names need to be unique only inside that namespace. Members with the same name can be created in some other namespace without any complaint from the compiler.

To declare a namespace, C# has the reserved keyword namespace. When a new project is created in Visual Studio .NET, it automatically adds using directives for some commonly used namespaces. These can differ between project types, but most of them sit under the base namespace System. A namespace declared in a different project must be referenced and then imported with a using directive, or its types must be written with their full names.

Here is an example of declaring namespaces:


using System;

namespace OutNamespace
{
    namespace WorkNamespace
    {
        // classes, structures etc. can be placed here
    }
}


In this example we create two namespaces arranged in a hierarchy: an outer one named OutNamespace and an inner one called WorkNamespace. The inner namespace is where the class WorkItem, shown next, is declared.

The next logical topic after namespaces is classes. A class is the basis of object-oriented programming. It encapsulates data and methods in a single unit, and other code manipulates the data by interacting with objects of the class.


class WorkItem
{
    public WorkItem()
    {
    }

    // Static constructor: runs once for the class, not per instance.
    static WorkItem()
    {
        m_counter = 1;
    }

    public static int GetCounter()
    {
        return m_counter;
    }

    private static int m_counter;

    public virtual void Status()
    {
    }

    internal bool IsWorking
    {
        get
        {
            return m_isWorking;
        }
        set
        {
            m_isWorking = value;
        }
    }
    private bool m_isWorking = true;
}


The sample above is the class WorkItem, which lives inside WorkNamespace.

As already discussed, a class is the basis of object-oriented programming. Every class has a constructor, a method used for initialization; if none is written, the compiler supplies a default parameterless one. A class can carry modifiers that affect its accessibility and behavior: new, public, protected, internal, private, abstract and sealed. Class members can have these modifiers as well. The example above also declares a special kind of constructor, the static constructor. It runs once for the class itself, not for each instance, and it initializes static members. Static members are likewise accessed through the class name rather than through an instance.

OutNamespace.WorkNamespace.WorkItem item = new OutNamespace.WorkNamespace.WorkItem();
int i = OutNamespace.WorkNamespace.WorkItem.GetCounter();

This piece of code creates an instance of WorkItem using its full name (including all namespace names). The second line calls the public static method GetCounter of WorkItem. Many advanced features can be used when building classes. One of them is polymorphism, which is realized through virtual methods.
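Writing full names everywhere quickly becomes tedious, so a using directive or a using alias is normally used instead. The sketch below relies on hypothetical namespaces (Geometry.Flat and Geometry.Solid) to show two classes with the same name living side by side.

```csharp
using System;
using Shapes2D = Geometry.Flat;   // alias for a nested namespace

namespace Geometry
{
    namespace Flat
    {
        public class Point
        {
            public int X;
            public int Y;
        }
    }

    namespace Solid
    {
        // Same class name as Flat.Point; no conflict, different namespace.
        public class Point
        {
            public int X;
            public int Y;
            public int Z;
        }
    }
}

public static class NamespaceDemo
{
    public static void Main()
    {
        Shapes2D.Point p2 = new Shapes2D.Point();                // through the alias
        Geometry.Solid.Point p3 = new Geometry.Solid.Point();    // through the full name

        Console.WriteLine(p2.GetType().FullName);   // Geometry.Flat.Point
        Console.WriteLine(p3.GetType().FullName);   // Geometry.Solid.Point
    }
}
```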

The sample code can be compiled with the csc command on the Microsoft .NET command line. Type: csc filename.cs. This creates a file named filename.exe that can be run like any other executable file.

5 C# .Net and Java

This article compares the same program written in C# and in Java, and then compares the disassembled code of both.

Java Hello Program:


class Hello
{
    public static void main(String args[])
    {
        System.out.println("Hello");
    }
}


Disassembled Java Hello Program:


class Hello
{
    Hello()
    {
        // 0 0:aload_0
        // 1 1:invokespecial #1 <Method void Object()>
        // 2 4:return
    }

    public static void main(String args[])
    {
        System.out.println("Hello");
        // 0 0:getstatic #2 <Field PrintStream System.out>
        // 1 3:ldc1 #3 <String "Hello">
        // 2 5:invokevirtual #4 <Method void PrintStream.println(String)>
        // 3 8:return
    }
}


Explanation of Java program:

To understand this listing you need some knowledge of computer internals, for example opcodes and instruction templates. I assume that you already know them.

As usual, execution starts with the main method, whose first line is an ordinary print statement that prints "Hello". The commented lines are the bytecode instructions. In the JVM instruction tables, a specific instruction with type information is built by replacing the 'T' in the instruction template with the letter for the type. For example, aload_0 in the constructor is the load instruction for type reference (prefix a); it loads this onto the operand stack.

An invokespecial instruction must name an instance initialization method, a method in the current class, or a method in a superclass of the current class. Class and interface initialization methods are invoked implicitly by the Java virtual machine; they are never invoked directly from any Java virtual machine instruction, but only indirectly as part of the class initialization process.

If invokevirtual or invokespecial is used to access a protected method of a superclass, then the type of the class instance being accessed must be the same as or a subclass of the current class.

The Java virtual machine uses local variables to pass parameters on method invocation. On class method invocation, any parameters are passed in consecutive local variables starting from local variable 0.

C# Hello Program:


using System;

class Hello
{
    public static void Main(string[] args)
    {
        Console.WriteLine("Hello");
    }
}


Disassembled C# Hello Program :


.method private hidebysig static void Main() cil managed
{
.entrypoint
// Code size 11 (0xb)
.maxstack 8
IL_0000: ldstr "Hello"
IL_0005: call void [mscorlib]System.Console::WriteLine(class System.String)
IL_000a: ret
} // end of method Hello::Main


Explanation of C# .Net Program:

The first line defines the Main method by using the .method MSIL keyword. Note that the method is marked static, and that it is defined as cil managed, meaning it contains IL that runs under the control of the runtime. The access modifier and parameter list in the IL simply mirror the C# declaration; this listing shows private and no parameters, which corresponds to a Main declared as static void Main().

The next line of code uses the MSIL .entrypoint keyword to designate this particular method as the entry point of the application. When the .NET runtime executes the application, this is where control is passed to the program.

Next, look at the MSIL opcodes on lines IL_0000 and IL_0005. The first uses the ldstr (Load String) opcode to load a hard-coded literal ("Hello") onto the stack.

The next line of code calls the System.Console.WriteLine method. Notice that the MSIL prefixes the method name with the name of the assembly that defines the method, here [mscorlib]. This line also tells us the number of arguments (and their types) that the method expects. Here, the method expects a System.String object to be on the stack when it is called. Finally, line IL_000a is a simple ret MSIL opcode that returns from the method.

C# .Net Tutorial Attributes

This article deals with the C# .NET feature called attributes. It can be used as self-paced C# .NET training or tutorial material.

Attributes provide a powerful way of associating declarative information with C# code. They can be applied to types, methods, properties and other language elements.

As a rule of thumb, a class is an attribute class if it derives directly or indirectly from the System.Attribute class.

Attributes in C# .net – Sample:

This C# tutorial on attributes defines a custom attribute class. The class ProductInfo is derived from the System.Attribute class and holds information about the application author and the product version. The class is:


[AttributeUsage(AttributeTargets.All)]
public class ProductInfo : System.Attribute
{
    public ProductInfo(double version, string name)
    {
        m_version = version;
        m_authorName = name;
    }

    public double Version
    {
        get { return m_version; }
    }
    private double m_version = 1.00;

    public string AuthorName
    {
        get { return m_authorName; }
    }
    private string m_authorName;
}


The first line of code contains something new: the AttributeUsage attribute. It specifies which program elements the ProductInfo attribute may be applied to; the parameter AttributeTargets.All allows every kind of element.

Now we can use this attribute in the declaration of any other class:

[ProductInfo(1.005, "CoderSource")]
public class AnyClass { }

After applying the custom attribute, we can read it back at run time. Such run-time information is retrieved from a class with the types in the System.Reflection namespace. The Main function of the tutorial program can obtain all the information about the custom attributes of our class. The next piece of code demonstrates it:


// requires: using System.Reflection;
MemberInfo membInfo = typeof(AnyClass);
object[] attributes = membInfo.GetCustomAttributes(typeof(ProductInfo), true);

if (attributes.Length != 0)
{
    ProductInfo pr = (ProductInfo)attributes[0];
    Console.WriteLine("Product author: {0}", pr.AuthorName);
    Console.WriteLine("Product version: {0}", pr.Version);
}


Using the standard methods for obtaining information about a type and its members, we can retrieve all custom attributes.

The example in this C# tutorial uses only one of the three attributes that the C# language itself treats specially, AttributeUsage, which is used only when declaring a custom attribute class. Another is Conditional. A call to a method marked with Conditional is compiled only when the named conditional compilation symbol has been defined, typically with the #define preprocessor directive. The last one is Obsolete. It marks program elements that should no longer be used, so that the compiler warns about them. These are the basics of attributes and their usage in applications.
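The Conditional and Obsolete attributes can be tried out in a few lines. This is only a sketch: the symbol TRACE_ON, the class ReservedAttributesDemo and its methods are names invented for this illustration.

```csharp
#define TRACE_ON   // comment this line out and the Trace call below disappears

using System;
using System.Diagnostics;

public static class ReservedAttributesDemo
{
    // Calls to this method are compiled only when TRACE_ON is defined
    // in the file that contains the call. Conditional methods must return void.
    [Conditional("TRACE_ON")]
    public static void Trace(string message)
    {
        Console.WriteLine("trace: " + message);
    }

    // The compiler warns wherever this method is called.
    [Obsolete("Use NewApi instead")]
    public static int OldApi()
    {
        return 1;
    }

    public static int NewApi()
    {
        return 1;
    }

    public static void Main()
    {
        Trace("starting");              // printed because TRACE_ON is defined above
        Console.WriteLine(NewApi());    // 1
    }
}
```

Calling OldApi() still compiles, but produces a compiler warning containing the message passed to Obsolete.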

The example can be compiled with the Microsoft .NET Framework on the command line. At the command prompt, type: csc attributes.cs. The C# compiler creates an executable version of it, which you can then run.

6 C# .Net Tutorial Exceptions

This article is a basic C# tutorial dealing with exceptions in C# .NET.

Practically any program, including one written in C# .NET, can contain errors. They can be broadly classified as compile-time errors and runtime errors. Compile-time errors are found while the source code is being compiled; most of them are syntax errors. Runtime errors happen while the program is running.

Runtime errors are much harder to find and debug. In .NET they are reported as exceptions. Good coding style requires that a program be able to handle any runtime error. Exceptions in C# can be raised in two ways:

  1. Explicitly, with the throw statement. Control is transferred immediately to the nearest matching exception handler.
  2. Implicitly, when an operation cannot complete normally, for example an integer division by zero.

Simple Exceptions – .Net C# Tutorial:

C# uses many types of exceptions, each defined by its own class. All of them inherit from the base class System.Exception. There are classes for many kinds of errors: out of memory, stack overflow, null reference, index out of range, invalid cast, arithmetic errors and so on. This C# tutorial deals with the DivideByZeroException and with custom exception classes.

C# defines several keywords for processing exceptions. The most important are try, catch and finally.

The first one to know is try. It encloses a part of the code in which an exception might be thrown. A try block cannot stand alone: it must be followed by at least one catch block, by a finally block, or by both.

See the following example of handling a simple exception in C#.


//Sample code for C# Exception tutorial using try, catch

// try catch exception
int zero = 0;
try
{
    int div = 100 / zero;
}
catch (DivideByZeroException)
{
    Console.WriteLine("Division by zero exception passed");
}


At run time this code throws a DivideByZeroException and writes a message to the console. If you also need to release resources that were acquired, use the try-finally construction. The finally block is executed whether or not an exception was raised.


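//Sample code for C# Exception tutorial using try, finally
// requires: using System.Drawing;

Bitmap bit = null;
// try finally exception
try
{
    bit = new Bitmap(100, 100);
}
finally
{
    // bit is still null if the constructor threw, so check before disposing
    if (bit != null)
    {
        bit.Dispose();
        Console.WriteLine("bitmap is disposed");
    }
}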


In a similar way we can use the try-catch-finally construction, which combines all three keywords.
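A combined try-catch-finally block might look like the following sketch. The class TryCatchFinallyDemo and the method SafeDivide, including its convention of returning -1 on failure, are assumptions made only for this illustration.

```csharp
using System;

public static class TryCatchFinallyDemo
{
    // Returns a / b, or -1 when b is zero.
    public static int SafeDivide(int a, int b)
    {
        try
        {
            return a / b;
        }
        catch (DivideByZeroException ex)
        {
            Console.WriteLine("caught: " + ex.GetType().Name);
            return -1;
        }
        finally
        {
            // Runs after the try block and after the catch block,
            // even though both of them return.
            Console.WriteLine("finally executed");
        }
    }

    public static void Main()
    {
        Console.WriteLine(SafeDivide(10, 2));   // 5
        Console.WriteLine(SafeDivide(10, 0));   // -1
    }
}
```

In both calls the line "finally executed" is written before the result, which shows that the finally block runs regardless of whether an exception occurred.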

Custom Exception Classes – C# Tutorial:

Larger projects often need their own custom exception classes. Let us create a class that validates an email address by checking for the "@" symbol. Have a look at the following piece of code:


//Sample code for C# .Net Exception tutorial – validates an email address

public class TextException : Exception
{
    public TextException() : base()
    {
    }

    public TextException(string message) : base(message)
    {
    }
}

public class MailValidator
{
    // Private constructor: the class is used only through its static method.
    private MailValidator()
    {
    }

    private static char symbol = '@';

    public static void TestEnteredMail(string mailAddress)
    {
        if (mailAddress.IndexOf(symbol) == -1)
        {
            Console.WriteLine("The string entered is not a valid email address");
            throw new TextException();
        }
    }
}


Here we created a TextException class that inherits from the System.Exception class of the .NET class library. It adds no behavior of its own; its purpose is to give the error a distinct type. The additional class MailValidator has a TestEnteredMail method that raises a TextException when the address contains no "@". Now look at how it is used in the Main function.


try
{
    MailValidator.TestEnteredMail(Console.ReadLine());
}
catch (TextException)
{
    Console.WriteLine("Exception was passed");
}


7 C# .Net Tutorial Interfaces

This C# tutorial deals with interfaces in C# .NET. An interface is a reference type that contains only abstract members. Its members can be events, methods, properties and indexers. The interface contains only the declarations of its members; the implementation must be placed in the class that implements the interface. In the version of C# covered by this tutorial, an interface cannot contain constants, data fields, constructors, destructors or static members.

Interface Sample – C# Tutorial:

Let us look at a simple example of C# interfaces. In this example the interface declares the basic functionality of a node object.


interface INode
{
    string Text
    {
        get;
        set;
    }

    object Tag
    {
        get;
        set;
    }

    int Height
    {
        get;
        set;
    }

    int Width
    {
        get;
        set;
    }

    float CalculateArea();
}


The code above declares the get/set properties Text, Tag, Height and Width and the method CalculateArea. So far we have only declared the members of the interface; now we need to create a class that implements the functionality of the INode interface.


//Deriving a class using a C# .Net interface – c# tutorial

public class Node : INode
{
    public Node()
    {
    }

    public string Text
    {
        get { return m_text; }
        set { m_text = value; }
    }
    private string m_text;

    public object Tag
    {
        get { return m_tag; }
        set { m_tag = value; }
    }
    private object m_tag = null;

    public int Height
    {
        get { return m_height; }
        set { m_height = value; }
    }
    private int m_height = 0;

    public int Width
    {
        get { return m_width; }
        set { m_width = value; }
    }
    private int m_width = 0;

    public float CalculateArea()
    {
        if ((m_width < 0) || (m_height < 0))
            return 0;

        return m_height * m_width;
    }
}


The code above creates a C# class Node that implements the INode interface and all its members. A very important point to remember about C# interfaces is that a class which implements an interface must implement all its declared members. Otherwise the C# compiler reports an error.

That was a simple example of C# interface usage. Now for some more advanced details. The previous example implemented the members using the same simple names as in the interface. There is an alternative way of writing the implementation in the class, called explicit implementation. It uses the full method or property name, qualified by the interface, e.g. float INode.CalculateArea() { /* implementation */ }.
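Explicit implementation is most useful when two interfaces declare a member with the same name. The sketch below uses hypothetical interfaces (IEnglishGreeter, IFrenchGreeter) rather than INode so that it stays short and self-contained.

```csharp
using System;

public interface IEnglishGreeter
{
    string Greet();
}

public interface IFrenchGreeter
{
    string Greet();
}

public class Greeter : IEnglishGreeter, IFrenchGreeter
{
    // Explicit implementations carry no access modifier and are
    // reachable only through a reference of the interface type.
    string IEnglishGreeter.Greet()
    {
        return "Hello";
    }

    string IFrenchGreeter.Greet()
    {
        return "Bonjour";
    }
}

public static class ExplicitInterfaceDemo
{
    public static void Main()
    {
        Greeter g = new Greeter();
        // g.Greet();   // would not compile: Greet is not visible on the class itself

        Console.WriteLine(((IEnglishGreeter)g).Greet());   // Hello
        Console.WriteLine(((IFrenchGreeter)g).Greet());    // Bonjour
    }
}
```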

Multiple Inheritance using C# interfaces – C# Tutorial:

The next feature to explain is multiple inheritance using C# interfaces. A class can implement any number of interfaces, and it can also combine one base class with several interfaces. Here is a small piece of code that demonstrates multiple inheritance using only interfaces as parent types.


class ClonableNode : INode, ICloneable
{
    // The only member of ICloneable.
    public object Clone()
    {
        return null;
    }

    // INode members go here, implemented as in the Node class
}


The example above creates a class ClonableNode. It implements all the functionality of the INode interface in the same way as the Node class does. It also implements the Clone method, the only member of the ICloneable interface of the .NET library.

is Operator for C# .Net interfaces – C# Tutorial:

Finally, there is a C# operator that tests whether an object is of a given type: the "is" operator. Have a look at the following piece of code:


if (nodeC is INode)
    Console.WriteLine("nodeC is object of INode type");
else
    Console.WriteLine("nodeC isn't object of INode type");


In this example nodeC was created as an object of type ClonableNode, but when we run the program the “is” test in the if statement returns true. It means that nodeC also has the INode type. Ultimately the main objective of C# .Net interfaces is to separate definition from implementation.
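
The following self-contained sketch shows the “is” test against several types. To keep it short it leaves out INode and implements only ICloneable, so this ClonableNode is a simplified stand-in for the class above, and the variable is declared as object purely for illustration.

```csharp
using System;

// Simplified stand-in for the tutorial's class: only ICloneable is implemented here.
class ClonableNode : ICloneable
{
    public object Clone()
    {
        return new ClonableNode();
    }
}

class Program
{
    static void Main()
    {
        // Declared as object so that every test below is a genuine run-time check.
        object nodeC = new ClonableNode();

        Console.WriteLine(nodeC is ClonableNode); // the concrete class
        Console.WriteLine(nodeC is ICloneable);   // an interface it implements
        Console.WriteLine(nodeC is IDisposable);  // an interface it does not implement
    }
}
```

This should print True, True and False, in that order.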

8 C# .Net Tutorial Multithreading

Any Windows application runs in one or more processes. A process is a structural unit that owns a block of memory and uses some set of resources. For each executable it starts, the Windows operating system creates an isolated memory block. This C# .Net Tutorial tries to explain the basics of Multithreading in C# .Net.

Every process must have at least one thread. The first thread is created together with the process and is known as the primary thread. The primary thread starts at the entry point of the application. In traditional Windows applications this is the function WinMain(), and in C# applications it is the method Main().

The main goal of creating a multithreaded application is better performance and responsiveness. As an example, imagine a user who starts a long operation (e.g. copying). In a single-threaded application he cannot do anything else until the operation completes. In a multi-threaded application the copying can run in the background while he keeps interacting with the application.

An important point to remember when creating a multi-threaded application is that different threads can try to modify the same global variable at the same time. This is a generic problem, which is solved using a mechanism called synchronization of threads. Synchronization is nothing but a set of rules that controls how threads operate on shared data or resources.

The .Net class library has a powerful namespace for programming with threads as well as thread synchronization. The name of the namespace is System.Threading. The most important class inside this namespace for manipulating threads is the class Thread. It can run another thread in our application process.

Sample program on C# Multithreading – C# Tutorial:

The example creates an additional C# .Net class Launcher. It has only one method, which outputs a countdown to the console.


//Sample for C# tutorial on Multithreading using lock

public void Coundown()
{

lock(this)
{

for(int i=4;i>=0;i--)
{

Console.WriteLine("{0} seconds to start",i);

}

Console.WriteLine("GO!!!!!");

}

}


The above chunk of .Net C# tutorial code contains a new keyword, lock. It provides a mechanism for synchronizing thread operations: at any point of time only one thread can execute the locked block for a given object. Until the lock is released at the end of the block, no other thread can enter it.

To understand it more clearly, please have a look at this piece of the Main method’s code:


Launcher la = new Launcher();

Thread firstThread = new Thread(new ThreadStart(la.Coundown));
Thread secondThread =new Thread(new ThreadStart(la.Coundown));
Thread thirdThread = new Thread(new ThreadStart(la.Coundown));

firstThread.Start();
secondThread.Start();
thirdThread.Start();


As you see, three additional threads were created. Each thread runs a method of the same object of type Launcher. Because of the lock, each countdown is printed completely before the next one starts. The above program is a very simple example of using multi-threading in C# .Net, but C# .Net allows us to create more powerful applications of any level of complexity.
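
The two fragments can be put together into one program roughly as follows. This is only a sketch with a few assumptions of its own: the method is spelled Countdown here, the lock is taken on a private helper object (m_sync, a name invented for this sketch) instead of this, and Main waits for the threads with Join.

```csharp
using System;
using System.Threading;

class Launcher
{
    // Private object used only for locking (an assumption of this sketch;
    // the listing above locks on this).
    private readonly object m_sync = new object();

    public void Countdown()
    {
        lock (m_sync)
        {
            for (int i = 4; i >= 0; i--)
            {
                Console.WriteLine("{0} seconds to start", i);
            }

            Console.WriteLine("GO!!!!!");
        }
    }
}

class Program
{
    static void Main()
    {
        Launcher la = new Launcher();
        Thread[] threads = new Thread[3];

        // Start three threads that all run the same method of the same object.
        for (int n = 0; n < threads.Length; n++)
        {
            threads[n] = new Thread(new ThreadStart(la.Countdown));
            threads[n].Start();
        }

        // Wait for every countdown to finish before leaving Main.
        foreach (Thread t in threads)
        {
            t.Join();
        }
    }
}
```

With the lock in place the console should show three complete countdowns one after another. If the lock statement is removed, lines from different threads may interleave.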

9 C# .Net Tutorial Reflection

Reflection is the feature in .Net which enables us to get information about an object at run time. That information includes data about the object’s class, such as the names of its methods and its constructors.

A C# .Net program that uses reflection should import the namespace System.Reflection. There are two ways to get a Type object. The typeof operator takes the name of a type known at compile time, e.g. typeof(TestDataType). The GetType() method is called on an object and returns the type of that object at run time. This C# tutorial on reflection explains this feature with a sample class.


public class TestDataType
{

public TestDataType()
{
counter = 1;
}

public TestDataType(int c)
{
counter = c;
}

private int counter;

public int Inc()
{
return counter++;
}
public int Dec()
{
return counter--;
}

}


First we should get the type of an object that was created. The following C# .Net code snippet shows how to do it.

TestDataType testObject = new TestDataType(15);
Type objectType = testObject.GetType();

Now objectType has all the required information about the class TestDataType. We can check whether our type is abstract or whether it is a class: System.Type has properties such as IsAbstract and IsClass, each of which returns a Boolean value. There are also methods that return information about the constructors and methods that belong to the current type (class), as in the next example:


Type objectType = testObject.GetType();

ConstructorInfo [] info = objectType.GetConstructors();
MethodInfo [] methods = objectType.GetMethods();

// get all the constructors
Console.WriteLine("Constructors:");
foreach( ConstructorInfo cf in info )
{
Console.WriteLine(cf);
}

Console.WriteLine();
// get all the methods
Console.WriteLine("Methods:");
foreach( MethodInfo mf in methods )
{
Console.WriteLine(mf);
}


Now, the above program prints a list of the constructors and public methods of the TestDataType class.
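
Reflection can also be used to call a method that was looked up by name. The sketch below uses a trimmed copy of TestDataType (only one constructor and the Inc method) and the standard Type.GetMethod and MethodInfo.Invoke calls. It is an illustration of the idea, not part of the original sample.

```csharp
using System;
using System.Reflection;

// Trimmed copy of the tutorial's class, kept short for this sketch.
public class TestDataType
{
    private int counter;

    public TestDataType(int c)
    {
        counter = c;
    }

    public int Inc()
    {
        return counter++;
    }
}

class Program
{
    static void Main()
    {
        TestDataType testObject = new TestDataType(15);
        Type objectType = testObject.GetType();

        Console.WriteLine(objectType.Name);                     // TestDataType
        Console.WriteLine(objectType.IsClass);                  // True
        Console.WriteLine(objectType.IsAbstract);               // False
        Console.WriteLine(objectType == typeof(TestDataType));  // True

        // Look the method up by name and call it through reflection.
        MethodInfo inc = objectType.GetMethod("Inc");
        object first = inc.Invoke(testObject, null);
        object second = inc.Invoke(testObject, null);
        Console.WriteLine("{0} {1}", first, second);            // 15 16
    }
}
```

Inc returns the value before incrementing, which is why the two reflected calls yield 15 and then 16.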

Reflection is a very powerful feature, because it allows us to get information about objects at run time. It can be used in ordinary applications, but it is mainly intended for advanced programming tasks such as run-time code generation (creating, compiling and executing source code at run time).

9 .Net COM Interop

The ultimate goal of COM Interop is to provide access to the existing COM components without requiring that the original component be modified. This tries to make the .NET types equivalent to the COM Types.

COM Interop and Marshalling:

COM had a mechanism of marshalling and un-marshalling to convert between the source and target data types. This is now totally covered in COM Interop using RCW or Runtime Callable Wrapper. This automatically converts the .Net data types to the corresponding COM data types.

RegAsm and tlbimp in COM Interop

In addition, COM Interop allows COM developers to access managed objects as easily as they access other COM objects. It provides a specialized utility (RegAsm.exe) that exports the managed types into a type library and registers the managed component as a traditional COM component. At run time, the common language runtime marshals data between COM objects and managed objects as needed.

This tutorial shows how C# can use COM objects to develop applications. The first thing to be done is the creation of a wrapper for the selected COM object. It can be done manually with the command line application TlbImp.exe (Type Library Importer). This utility converts a COM type library to .NET Framework metadata. The same procedure can be done automatically in the .NET development environment by adding a reference to the COM object to our C# project. To do it manually, type the following in the .NET command line: tlbimp $WindowsPath$\system32\quartz.dll /out:quartzNET.dll. This command creates a new dll with types that are compatible with any of the managed .NET languages. Now we can add this dll to our C# project or compile our C# file with the additional option “/r:quartzNET.dll”.

The following is an example of using this COM object in C# managed code:


quartzNET.FilgraphManager manager = new quartzNET.FilgraphManagerClass();
quartzNET.IMediaControl mc = (quartzNET.IMediaControl)manager;
mc.RenderFile(args[0]);
mc.Run();
// Wait for completion.
Console.WriteLine("Press Enter to continue.");
Console.ReadLine();


Here mc is the MediaControl object, which was created in COM. This application takes the name of the video file that we want to play from the command line and shows it. So, this is a simple example of using COM Interop. To compile the attached example we just need quartzNET.dll (it is attached too) and the .NET command line. Type the command csc InteropSample.cs /r:quartzNET.dll. It creates an executable file, which can be run from the command line: just type InteropSample.exe some.avi. It opens a console application and also runs a standard Windows media player control to play the video.

10. Methods and properties

Any class in an object-oriented language has method and property members. These are the places where the actual business logic or functionality is written and executed. This tutorial explains how to create and use methods and properties in C#.

C# Methods:

A method is a basic object-oriented building block. All C# programs are constructed from a number of classes, and almost all classes contain methods. A class, when instantiated, is called an object. Object-oriented programming says that the data members of each object represent its state and its methods represent its behavior.

Method Signature in C#:

Each method is declared as follows:

Return-type methodname ( Parameterslist );

For a better understanding of methods, let us consider the following example. We have a class Man. It can have fields and methods like these:

public class Man
{
public Man(){}
private int m_old;
private string m_name;
public string WhatIsYourName()
{
Console.WriteLine(m_name);
return m_name;
}
public string HowOldAreYou()
{
Console.WriteLine(m_old.ToString());
return m_old.ToString();
}
}

The private members m_old and m_name define the state of objects created as instances of our class. The class Man also has two methods, which serve some of our requests. The method WhatIsYourName() writes the current object’s name to the console and returns it. The second one, HowOldAreYou(), similarly writes the man’s age to the console and returns it as a string.

Both methods in the example above return string, which is a built-in data type. Methods can also return any other built-in C# type or any custom type created by us.

Passing Parameters to Methods in C#:

The input parameters can be passed in two ways.

  • By value
  • By reference.

If a parameter is passed by value, a new copy of it is created and passed into the function. If it is passed by reference, only the address of the caller’s variable is passed.

The next example passes its parameter by value:

public int CalculateBirthYear(int year)
{
int b = year - m_old;
Console.WriteLine("Birth year is {0}",b);
return b;
}

If an input parameter is passed by reference, it must be marked with the keyword ref, both in the declaration and in the call. In that way the method operates on the same cell in memory as the caller, which means the caller’s variable can be changed inside the method. A small example of a parameter passed by reference is:

public int CalculateBirthYear(ref int year)
{
int b = year - m_old;
Console.WriteLine("Birth year is {0}",b);
return b;
}

Now, the function CalculateBirthYear can even modify the value of year as it is passed by reference.
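
The difference between the two ways of passing becomes visible only when the method assigns to its parameter. The following sketch uses two hypothetical helper methods (AddTenByValue and AddTenByReference, invented for this illustration) to show it.

```csharp
using System;

class Program
{
    // Receives a copy: the caller's variable is not affected.
    static void AddTenByValue(int year)
    {
        year += 10;
    }

    // Receives a reference: the caller's variable is modified.
    static void AddTenByReference(ref int year)
    {
        year += 10;
    }

    static void Main()
    {
        int year = 2000;

        AddTenByValue(year);
        Console.WriteLine(year);      // still 2000

        AddTenByReference(ref year);  // ref is required at the call site too
        Console.WriteLine(year);      // now 2010
    }
}
```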

Output Parameters in Methods:

The return value of a function is enough if only one value is needed. But in case a function is required to return more than one value, output parameters are the norm. C++ has no dedicated keyword for this, though the same effect can be achieved there with pointers or references. In C# an output parameter is declared with the keyword out before the data type. A typical example is as follows.

public void CalculateBirthYear(ref int year, out int birthyear)
{
int b = year - m_old;
Console.WriteLine("Birth year is {0}",b);
birthyear = b;
return;
}

Both ref and out parameters are passed by reference. The only difference is that a ref argument must be assigned a value before the call, whereas an out argument need not be; instead, the method must assign the out parameter before it returns.
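
As a small sketch of returning two results through out parameters, the hypothetical method Divide below (not part of the Man class) hands back both a quotient and a remainder. Note that q and r are declared without initial values, which would not be allowed for ref arguments.

```csharp
using System;

class Program
{
    // Hypothetical helper: both results come back through out parameters.
    static void Divide(int dividend, int divisor, out int quotient, out int remainder)
    {
        quotient = dividend / divisor;   // every out parameter must be assigned
        remainder = dividend % divisor;  // before the method returns
    }

    static void Main()
    {
        int q;   // no initial value needed for out arguments
        int r;

        Divide(17, 5, out q, out r);
        Console.WriteLine("{0} remainder {1}", q, r);   // 3 remainder 2
    }
}
```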

Variable arguments in C#:

The C# language supports variable arguments through a keyword called params. A typical example for the declaration of a function with variable argument signature is as follows.

public void functionName(int a, params int[] varParam);

Method Overloading in C#:

A method is considered to be overloaded if there are two or more signatures for the same method name. The overloads must differ in their parameter lists; a difference in return type alone is not enough.

A simple example of overloaded methods:

public void functionName(int a, params int[] varParam);
public void functionName(int a);
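
The sketch below combines variable arguments and overloading in one place. The method name Sum and its bodies are assumptions made for illustration. It also shows that when both overloads could accept a call, the compiler prefers the one without params.

```csharp
using System;

class Program
{
    // Overload with exactly one parameter.
    static int Sum(int a)
    {
        Console.WriteLine("one argument");
        return a;
    }

    // Overload with a variable number of extra arguments.
    static int Sum(int a, params int[] rest)
    {
        Console.WriteLine("params overload, {0} extra", rest.Length);
        int total = a;
        foreach (int value in rest)
        {
            total += value;
        }
        return total;
    }

    static void Main()
    {
        Console.WriteLine(Sum(1));                          // non-params overload is chosen
        Console.WriteLine(Sum(1, 2, 3));                    // extra values are packed into an array
        Console.WriteLine(Sum(1, new int[] { 2, 3, 4 }));   // an array can also be passed directly
    }
}
```

The three calls should return 1, 6 and 10 respectively, with the first one reporting "one argument".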

Property in C#:

A property is a special kind of member that returns the current object’s state or sets it. The simple syntax of properties can be seen in the following example:

public int Old
{
get {return m_old;}
set {m_old = value;}
}
public string Name
{
get {return m_name;}
}

Here are two kinds of properties. The first one can get or set the field of the class named m_old, and the second is read only, which means it can only get the current object’s state.

The significance of properties is their usability. They need not be called through functions like objectname.get or objectname.set; values can be assigned to them or retrieved from them directly, as if they were fields.
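
A short usage sketch follows. The constructor that takes a name is an assumption added here so that the read-only Name property has something to return; the rest mirrors the properties shown above.

```csharp
using System;

public class Man
{
    private int m_old;
    private string m_name;

    // Constructor taking a name: an assumption of this sketch.
    public Man(string name)
    {
        m_name = name;
    }

    public int Old
    {
        get { return m_old; }
        set { m_old = value; }
    }

    public string Name
    {
        get { return m_name; }
    }
}

class Program
{
    static void Main()
    {
        Man man = new Man("John");

        man.Old = 30;                                        // runs the set accessor
        Console.WriteLine("{0} is {1}", man.Name, man.Old);  // runs both get accessors

        // man.Name = "Jack";   // would not compile: Name has no set accessor
    }
}
```

This should print "John is 30".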

A usage of methods and properties can be seen in the attached example. It can be compiled using the MS Visual Studio command line. To get an executable file, do the following steps: in the Start menu of Windows find Programs->MS Visual Studio .NET->MS Visual Studio .NET Tools->Visual Studio .NET Command Prompt and run it. The command csc runs the C# compiler; type a line similar to this: csc /out:My.exe My.cs. Now we can run our program from the exe file.

And finally a few words about access modifiers of methods and properties. In the examples there are only public members, but they can also be declared as private, protected or internal. Additional modifiers are new, static, virtual, abstract and override. All these will be dealt with in the next tutorials.

11 .Net Framework basics

When we speak about .Net, we mean the .NET Framework. The .NET Framework is made up of the Common Language Runtime (CLR) and the Base Class Library (System Classes). This allows us to build our own services (Web Services or Windows Services), Web applications (Web Forms or ASP .Net) and Windows applications (Windows Forms). We can see how this is all put together.

The above picture shows the overall architecture, demonstrating how the .NET languages follow the rules provided by the Common Language Specification (CLS). These languages can all be used independently to build applications and can all be used with built-in data describers (XML) and data accessors (ADO .NET and SQL). Every component of the .NET Framework can take advantage of the large pre-built library of classes called the Framework Class Library (FCL). Once everything is put together, the code that is created is executed in the Common Language Runtime. The Common Language Runtime is designed to allow any .NET-compliant language to execute its code. At the time of writing, these languages included VB .Net, C# and C++ .NET, but any language can become .NET-compliant if it follows the CLS rules. The following sections will address each of the parts of the architecture.

.Net Common Language Specifications (CLS):

In an object-oriented environment, everything is considered as an object. (This point is explained in this article and the more advanced features are explained in other articles.) You create a template for an object (this is called the class file), and this class file is used to create multiple objects.
TIP: Consider a Rectangle. You may want to create many Rectangles in your lifetime, but each Rectangle will have certain characteristics and certain functions. For example, each Rectangle will have a specific width and color. Now suppose your friend also wants to create a Rectangle. Why reinvent the Rectangle? You can create a common template and share it with others. They create their Rectangles based on your template. This is the heart of object-oriented programming: the template is the class file, and the Rectangles are the objects built from that class. Once you have created an object, your object needs to communicate with many other objects.

It doesn’t matter if the other object was created in another .NET language, because each language follows the rules of the CLS. The CLS defines the necessary things such as common variable types (this is called the Common Type System, CTS), common visibility (when and where one can see these variables), common method specifications, and so on. There is not one rule which tells how C# composes its objects and another rule which tells how VB .Net does the same thing. To steal a phrase, there is now “One rule to bind them all.” One thing to note here is that the CLS simply provides the bare rules. Languages can adhere to their own specification. In this case, the actual compilers do not need to be as powerful as those that support the full CLS.

The Common Language Runtime (CLR):

The heart of the .Net Framework is the Common Language Runtime (CLR). All .NET-compliant languages run in a common, managed runtime execution environment. With the CLR, you can rely on code that is accessed from different languages. This is a huge benefit: one coder can write a module in C#, and another can access and use it from VB .Net. Another benefit is automatic object management: the runtime takes care of memory issues automatically. These are just a few of the benefits that you get from the CLR.

Microsoft Intermediate Language (MSIL):

So how can many different languages be brought together and executed together? The answer is Microsoft Intermediate Language (MSIL) or, as it’s more commonly known, Intermediate Language (IL). In its simplest terms, IL is a programming language. If you wanted to, you could write IL directly, compile it, and run it. But why would you want to write such low-level code? Microsoft has provided higher-level languages, such as C#, that one can use. Before the code is executed, the MSIL must be converted into platform-specific code. The CLR includes a JIT (just-in-time) compiler for this purpose, and the order of the steps is as follows.

Source Code => Compiler => Assembly => Class Loader => JIT Compiler => Managed Native Code => Execution.

The above is the order of compilation and execution of programs. Once a program is written in a .Net-compliant language, all the rest is the responsibility of the framework.

12 .Net C# Delegates and Events

This tutorial describes the basics of two additional features of the C# language, namely Delegates and Events. Other object-oriented languages such as Java solve the same problems differently, for example with listener interfaces.

Delegates in C# .Net:

In languages like C/C++ there is a feature called the callback function. It uses pointers to functions, which are passed as parameters to other functions. A delegate is a similar feature, but it is type-safe and fits the object-oriented constructions of the language. When we declare a delegate we specify the signature of the methods it can refer to: the return type and the parameters. A delegate is not a standalone construction; it is a class. Every delegate type is derived from a base delegate class of the .NET class library, called System.MulticastDelegate. So, now please have a look at the following example:


class Figure
{

public Figure(float a, float b, float c)
{
m_xPos = a;
m_yPos = b;
m_zPos = c;
}

public void InvertX()
{
m_xPos = -m_xPos;
}
public void InvertY()
{
m_yPos = -m_yPos;
}
public void InvertZ()
{
m_zPos = -m_zPos;
}

private float m_xPos = 0;
private float m_yPos = 0;
private float m_zPos = 0;

}


Now we have a class named Figure. It has three private fields that store a position and three methods that invert this position along each axis. In the main class we declare a delegate in the following way:

public delegate void FigureDelegate();

And now in the main function we should use it like this:


Figure figure = new Figure(10,20,30);

FigureDelegate fx = new FigureDelegate(figure.InvertX);
FigureDelegate fy = new FigureDelegate(figure.InvertY);
FigureDelegate fz = new FigureDelegate(figure.InvertZ);

MulticastDelegate f_del = fx+fy+fz;


In this example we create three delegates of the FigureDelegate type and attach our three methods from the Figure class to them. Now every delegate keeps the address of the attached function. The last line of code is very interesting: here we declare a variable of the base type (MulticastDelegate) and assign to it the combination of our three delegates. It is an allowed construction because every delegate derived from the base type can keep more than one pointer to a function; it contains a list of addresses of methods.
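
To see the combined delegate actually being called, consider the following sketch. It keeps the combined value in a FigureDelegate variable so that it can be invoked with ordinary call syntax, and it adds a ToString override to Figure purely so that the result can be printed; both choices are assumptions of this sketch.

```csharp
using System;

public delegate void FigureDelegate();

class Figure
{
    private float m_xPos;
    private float m_yPos;
    private float m_zPos;

    public Figure(float a, float b, float c)
    {
        m_xPos = a;
        m_yPos = b;
        m_zPos = c;
    }

    public void InvertX() { m_xPos = -m_xPos; }
    public void InvertY() { m_yPos = -m_yPos; }
    public void InvertZ() { m_zPos = -m_zPos; }

    // Added for this sketch so that the position can be printed.
    public override string ToString()
    {
        return string.Format("({0}, {1}, {2})", m_xPos, m_yPos, m_zPos);
    }
}

class Program
{
    static void Main()
    {
        Figure figure = new Figure(10, 20, 30);

        // Build one delegate whose invocation list holds three methods.
        FigureDelegate all = new FigureDelegate(figure.InvertX);
        all += new FigureDelegate(figure.InvertY);
        all += new FigureDelegate(figure.InvertZ);

        Console.WriteLine(all.GetInvocationList().Length);  // 3

        all();  // calls InvertX, InvertY and InvertZ in the order they were added

        Console.WriteLine(figure);  // (-10, -20, -30)
    }
}
```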

Events in C# .Net:

A delegate is a very useful construct in the C# language, as the method it calls can be chosen at run time rather than at compile time. But the main goal of delegates is their use in the event model. Events are the reactions of the system to user manipulations and other occurrences (e.g. mouse clicks, key presses, timer ticks etc.). To understand the usage of delegates in the event model, the previous example is used here. We should add the following things to our Figure class:


public delegate void FigureHandler(string msg);

public static event FigureHandler Inverted;

public void InvertZ()
{
m_zPos = -m_zPos;
Inverted("inverted by z-axis");
}


Now we have a delegate declared and an event that uses this delegate’s type. Each of the Invert methods should raise the event in the same way (only InvertZ is shown). It is not yet very clear why we should use such a construction, but the next code snippet should explain it:


static void Main(string[] args)
{

Figure figure = new Figure(10,20,30);

Figure.Inverted+=new Test.Figure.FigureHandler(OnFigureInverted);

figure.InvertX();

figure.InvertZ();

}

private static void OnFigureInverted(string msg)
{
Console.WriteLine("Figure was {0}",msg);
}


So, in the Main function we create an object of the Figure class and attach the method OnFigureInverted to the event as an event handler. When we call any of the Invert methods, the event is fired and it calls our event handler. The application will print the following strings to the console:

Figure was inverted by x-axis
Figure was inverted by z-axis
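
One detail worth knowing is that an event with no subscribers is null, so raising it directly would throw a NullReferenceException. The sketch below is a reduced variation of the Figure class (one axis only, an assumption made to keep it short) that checks for subscribers before raising the event and also shows how to detach a handler with -=.

```csharp
using System;

class Figure
{
    public delegate void FigureHandler(string msg);

    public static event FigureHandler Inverted;

    private float m_xPos = 10;

    public void InvertX()
    {
        m_xPos = -m_xPos;

        // The event is null while nobody has subscribed, so check first.
        FigureHandler handler = Inverted;
        if (handler != null)
        {
            handler("inverted by x-axis");
        }
    }
}

class Program
{
    static void Main()
    {
        Figure figure = new Figure();

        figure.InvertX();   // no subscribers yet: nothing is printed, no exception

        Figure.Inverted += new Figure.FigureHandler(OnFigureInverted);
        figure.InvertX();   // the handler runs once

        Figure.Inverted -= new Figure.FigureHandler(OnFigureInverted);
        figure.InvertX();   // handler detached again: nothing is printed
    }

    private static void OnFigureInverted(string msg)
    {
        Console.WriteLine("Figure was {0}", msg);
    }
}
```

This should print the line "Figure was inverted by x-axis" exactly once.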

These were simple examples of using delegates and events, and they should be treated as a starting point for learning more yourself.

13. All about Unsafe Code in C#

C# .net hides most of memory management, which makes life much easier for the developer, thanks to the Garbage Collector and the use of references. But to make the language powerful enough for the cases in which we need direct access to memory, unsafe code was invented.

Commonly, while programming in the .net framework we don’t need to use unsafe code, but in some cases there is no way around it, such as the following:

  • Real-time applications: we might need to use pointers to enhance performance in such applications.
  • External functions: in non-.net DLLs some functions require a pointer as a parameter, such as Windows APIs that were written in C.
  • Debugging: sometimes we need to inspect memory contents for debugging purposes, or we might need to write an application that analyzes another application’s process and memory.

Unsafe code is mostly about pointers which have the following advantages and disadvantages.

Advantages of Unsafe Code in C#:

  • Performance and flexibility: by using pointers you can access data and manipulate it in the most efficient way possible.
  • Compatibility: in most cases we still need to use old Windows APIs, which use pointers extensively, or third parties may supply DLLs in which some functions need pointer parameters. Although this can often be done by writing the DllImport declaration in a way that avoids pointers, in some cases it’s just much simpler to use a pointer.
  • Memory addresses: there is no way to know the memory address of some data without using pointers.

Disadvantages of Unsafe Code in C#:

  • Complex syntax: to use pointers you need to go through more complex syntax than we are used to in C#.
  • Harder to use: you need to be more careful and logical while using pointers. Misusing pointers might lead to the following:
    • Overwriting other variables.
    • Stack overflow.
    • Accessing areas of memory that don’t contain valid data as if they did.
    • Overwriting some of the .net runtime’s own information, which will surely lead your application to crash.
  • Your code will be harder to debug. A simple mistake in using pointers might lead your application to crash randomly and unpredictably.
  • Type safety: code that uses pointers fails the .net type-safety checks, and of course if your security policy doesn’t allow non-type-safe code, the .net framework will refuse to execute your application.

Now that we know all the risks that we might face while using pointers and all the advantages in performance and flexibility that pointers give us, let us find out how to use them. The keyword unsafe is used when dealing with pointers; the name reflects the risks that you might face while using them. Let’s see where to place it. We can declare a whole class as unsafe:

unsafe class Class1
{
//you can use pointers here!
}

Or only some class members can be declared as unsafe:

class Class1
{
//pointer
unsafe int * ptr;
unsafe void MyMethod()
{
//you can use pointers here
}
}

The same applies to other members such as the constructor and the properties.

To declare unsafe local variables in a method, you have to put them in unsafe blocks as the following:

static void Main()
{
//can’t use pointers here

unsafe
{
//you can declare and use pointer here

}

//can’t use pointers here
}

You can’t declare local pointers in a “safe” method in the same way we declared pointer fields of a class; we have to put them in an unsafe block.

static void Main()
{
unsafe int * ptri; //Wrong
}

If you got too excited and tried to use unsafe code right away, then when you compile the code just by using

csc test.cs

You will experience the following error:

error CS0227: Unsafe code may only appear if compiling with /unsafe

To compile unsafe code, use the /unsafe option:

csc test.cs /unsafe

In VS.net go to the project property page and in “configuration properties>build” set Allow Unsafe Code Blocks to True.

Now that we know how to declare a block as unsafe, we should learn how to declare and use pointers in it.

Declaring pointers

To declare a pointer of any type, all you have to do is put ‘*’ after the type name, such as

int * ptri;

double * ptrd;

NOTE: If you are used to pointers in C or C++, be careful: in the C# declaration int * ptri, i; the ‘*’ applies to the type itself, not to the variable, so ‘i’ is a pointer here as well. The same is true of array declarations.

void Pointers

If you want to declare a pointer, but you do not wish to specify a type for it, you can declare it as void.

void *ptrVoid;

The main use of this is when you need to call an API function that requires void* parameters. Within the C# language, there isn’t a great deal that you can do using void pointers.

Using pointers

Using pointers can be demonstrated in the following example:

static void Main()
{

int var1 = 5;

unsafe
{
int * ptr1, ptr2;
ptr1 = &var1;
ptr2 = ptr1;
*ptr2 = 20;
}

Console.WriteLine(var1);
}

The operator ‘&’ means “address of”, so ptr1 will hold the address of var1, and ptr2 = ptr1 will assign that same address to ptr2. Using ‘*’ before the pointer name means “the content of the address”, so 20 will be written where ptr2 points, which is var1.

Now the value of var1 is 20.

sizeof operator

As the name says, the sizeof operator returns the number of bytes occupied by the given data type:

unsafe
{
Console.WriteLine("sbyte: {0}", sizeof(sbyte));
Console.WriteLine("byte: {0}", sizeof(byte));
Console.WriteLine("short: {0}", sizeof(short));
Console.WriteLine("ushort: {0}", sizeof(ushort));
Console.WriteLine("int: {0}", sizeof(int));
Console.WriteLine("uint: {0}", sizeof(uint));
Console.WriteLine("long: {0}", sizeof(long));
Console.WriteLine("ulong: {0}", sizeof(ulong));
Console.WriteLine("char: {0}", sizeof(char));
Console.WriteLine("float: {0}", sizeof(float));
Console.WriteLine("double: {0}", sizeof(double));
Console.WriteLine("decimal: {0}", sizeof(decimal));
Console.WriteLine("bool: {0}", sizeof(bool));

//did I miss something?!
}

The output will be:

sbyte: 1
byte: 1
short: 2
ushort: 2
int: 4
uint: 4
long: 8
ulong: 8
char: 2
float: 4
double: 8
decimal: 16
bool: 1

Great, we don’t have to remember the size of every data type anymore!

Casting Pointers

A pointer actually stores an integer that represents a memory address, and it’s not surprising to know that you can explicitly convert any pointer to or from any integer type. The following code is totally legal.

int x = 10;
int *px;

px = &x;
uint y = (uint) px;
int *py = (int*) y;

A good reason for casting pointers to integer types is in order to display them. Console.Write() and Console.WriteLine() methods do not have any overloads to take pointers. Casting a pointer to an integer type will solve the problem.

Console.WriteLine("The Address is: " + (uint) px);

As I mentioned before, it’s totally legal to cast a pointer to any integer type. But does that really mean that we can use any integer type for casting? What about overflows? On a 32-bit machine, where an address runs from zero to about 4 billion, we can use uint, long and ulong. On a 64-bit machine only a 64-bit type such as ulong is wide enough. Note that casting the pointer to other integer types is very likely to cause an overflow. The real problem is that the checked keyword doesn’t apply to conversions involving pointers. For such conversions, exceptions won’t be raised when an overflow occurs, even in a checked context. When you are using pointers the .net framework will assume that you know what you’re doing and that you’ll be happy with the overflows!

You can explicitly convert between pointers pointing to different types. For example:

byte aByte = 8;
byte *pByte = &aByte;
double *pDouble = (double*) pByte;

This is perfectly legal code, but think twice if you are trying something like that. In the above example, the double value pointed to by pDouble will actually be made up of the byte (which is 8) combined with the neighboring bytes of memory, which surely won’t give a meaningful value. However, you might want to convert between types in order to implement a union, or you might want to cast pointers to other types into pointers to sbyte in order to examine individual bytes of memory.

Pointers Arithmetic

It’s possible to use the operators +, -, +=, -=, ++ and -- with pointers, with an integer value such as a long or ulong on the right-hand side of the binary operators. Arithmetic is not permitted on a void pointer, because the size of the type it points to is unknown.

For example, suppose you have a pointer to an int, and you want to add 1 to it. The compiler will assume that you want to access the following int in memory, and so will actually increase the value by 4 bytes, the size of an int. If the pointer was pointing to a double, adding 1 will increase its value by 8 bytes, the size of a double.

The general rule is that adding a number X to a pointer to type T with a value P gives the result P + X * sizeof(T).

Let’s have a look at the following example:

uint u = 3;
byte b = 8;
double d = 12.5;
uint *pU = &u;
byte *pB = &b;
double *pD = &d;

Console.WriteLine("Before Operations");
Console.WriteLine("Value of pU:" + (uint) pU);
Console.WriteLine("Value of pB:" + (uint) pB);
Console.WriteLine("Value of pD:" + (uint) pD);

pU += 5;
pB -= 3;
pD++;

Console.WriteLine("\nAfter Operations");
Console.WriteLine("Value of pU:" + (uint) pU);
Console.WriteLine("Value of pB:" + (uint) pB);
Console.WriteLine("Value of pD:" + (uint) pD);

The result is:
Before Operations
Value of pU:1242784
Value of pB:1242780
Value of pD:1242772
After Operations
Value of pU:1242804
Value of pB:1242777
Value of pD:1242780

5 * 4 = 20 was added to pU.

3 * 1 = 3 was subtracted from pB.

1 * 8 = 8 was added to pD.

We can also subtract one pointer from another, provided both pointers point to the same data type. The result is a long whose value is the difference between the pointers’ values divided by the size of the type that they point to:

double *pD1 = (double*) 12345632;
double *pD2 = (double*) 12345600;
long L = pD1 - pD2; // gives 4 = 32 / 8 (sizeof(double))

Note that the way of initializing pointers in the example is totally valid.

Pointers to Structs and Class members

Pointers can point to structs in the same way we used before, as long as the structs don’t contain any reference types. The compiler will report an error if you declare a pointer to a struct containing a reference type.

Let’s have an example,

Suppose we had the following struct:

struct MyStruct
{
public long X;
public double D;
}

Declaring a pointer to it will be:

MyStruct *pMyStruct;

Initializing it:

MyStruct myStruct = new MyStruct();
pMyStruct = & myStruct;

To access the members:

(*pMyStruct).X = 18;
(*pMyStruct).D = 163.26;

The syntax is a bit complex, isn’t it?

That’s why C# defines another operator that allows us to access members of structs through pointers with a simpler syntax. The operator “Pointer member access operator” looks like an arrow, it’s a dash followed by a greater than sign: ->

pMyStruct->X = 18;
pMyStruct->D = 163.26;

That looks better!

Fields within the struct can also be directly accessed through pointer of their type:

long *pL = &(myStruct.X);
double *pD = &(myStruct.D);

Classes and pointers are a different story. We already know that we can’t have a pointer pointing to a class instance, because a class is a reference type. The Garbage Collector doesn’t keep any information about pointers; it’s only interested in references, so creating pointers to classes could cause the Garbage Collector to not work properly.

On the other hand, class members could be value types, and it’s possible to create pointers to them. But this requires a special syntax. Remember that class members are embedded in a class instance, which sits on the heap. That means that they are still under the control of the Garbage Collector, which can at any time decide to move the class instance to a new location. The Garbage Collector knows about the reference, and will update its value, but again it’s not interested in the pointers around, and they will still be pointing to the old location.

To avoid the risk of this problem, the compiler reports an error if you try to create pointers to class members in the way we have been using up to now.

The way around this problem is by using the keyword “fixed”. It marks out a block of code bounded by braces, and notifies the Garbage Collector that there may be pointers pointing to members of certain class instances, which must not be moved.

Let’s have an example,

Suppose the following class:
class MyClass
{
public long X;
public double D;
}

Declaring pointers to its members in the regular way is a compile-time error:

MyClass myClass = new MyClass();

long *pX = &(myClass.X); //compile-time error.

To create pointers to class members, use the fixed keyword:

fixed (long *pX = &(myClass.X))
{

// use *pX here only.
}

The pointer pX is scoped to the fixed block only, and the fixed statement tells the garbage collector not to move “myClass” while the code inside the block is executing.
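Put together as a complete program, the fixed statement might be used as in the sketch below. The class names Counter and FixedDemo and the method SetThroughPointer are invented for this example, and the code again assumes compilation with unsafe code enabled.

```csharp
using System;

public class Counter
{
    public long X;   // value-type field living inside a heap object
}

public static class FixedDemo
{
    // Hypothetical helper: writes to the field through a pointer
    // while the containing object is pinned.
    public static unsafe void SetThroughPointer(Counter c, long value)
    {
        fixed (long* pX = &c.X)
        {
            *pX = value;   // pX is valid only inside this block
        }
    }

    public static void Main()
    {
        Counter c = new Counter();
        SetThroughPointer(c, 18);
        Console.WriteLine(c.X);   // prints 18
    }
}
```

Keeping the fixed block as short as possible is a sensible habit, since a pinned object restricts what the garbage collector can do while the block runs.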

stackalloc

The keyword “stackalloc” instructs the .NET runtime to allocate a certain amount of memory on the stack. It requires two things to do so: the type (value types only) and the number of elements you’re allocating space for. For example, if you want to allocate enough memory to store 4 floats, you can write the following:

float *ptrFloat = stackalloc float [4];

Or to allocate enough memory to store 50 shorts:

short *ptrShort = stackalloc short [50];

stackalloc simply allocates memory; it doesn’t initialize it to any value, so it’s up to you to initialize the memory locations that were allocated. The advantage of stackalloc is its very high performance.

A very useful application of stackalloc is creating an array directly on the stack. While C# has made using arrays very simple and easy, these arrays are actually objects derived from System.Array, and they are stored on the heap with all of the overhead that involves.

To create an array in the stack:

int size;
size = 6; //we can get this value at run-time as well.
int *int_ary = stackalloc int [size];

To access the array members, the obvious way is to use *(int_ary + i), where “i” is the index. But it won’t be surprising to know that it’s also possible to use int_ary[i].

*( int_ary + 0) = 5; //or *int_ary = 5;
*( int_ary + 1) = 9; //accessing member #1
*( int_ary + 2) = 16;

int_ary[3] = 19; //another way to access members
int_ary[4] = 7;
int_ary[5] = 10;

In a usual array, accessing a member outside the array bounds will cause an exception. But when using stackalloc, you’re simply accessing an address somewhere on the stack; writing to it could corrupt a variable’s value or, worse, the return address of a method currently being executed.

int[] ary = new int[6];
ary[10] = 5;//exception thrown

int *ary = stackalloc int [6];
ary[10] = 5; // no exception: 5 is written to the address ary + 10 * sizeof(int)

This takes us back to the beginning of the article: using pointers comes with a cost. You have to be very certain of what you’re doing; any small error could cause very strange and hard-to-debug run-time bugs.
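For comparison, newer versions of the language offer a way to keep the stack allocation while getting bounds checks back. The sketch below assumes C# 7.2 or later and a runtime that provides System.Span<T>, neither of which existed when this article was written; the names SpanDemo, SumOfSquares and OutOfRangeIsCaught are made up for the illustration.

```csharp
using System;

public static class SpanDemo
{
    // Sums the squares 0*0 .. (count-1)*(count-1) using a stack-allocated buffer.
    public static int SumOfSquares(int count)
    {
        Span<int> buffer = stackalloc int[count];   // no unsafe keyword needed
        for (int i = 0; i < buffer.Length; i++)
        {
            buffer[i] = i * i;
        }
        int sum = 0;
        foreach (int v in buffer)
        {
            sum += v;
        }
        return sum;
    }

    // Returns true if an out-of-range write is rejected by the span.
    public static bool OutOfRangeIsCaught()
    {
        Span<int> buffer = stackalloc int[6];
        try
        {
            buffer[10] = 5;   // index is checked against the span's length
            return false;
        }
        catch (IndexOutOfRangeException)
        {
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(SumOfSquares(6));        // prints 55
        Console.WriteLine(OutOfRangeIsCaught());   // prints True
    }
}
```

The memory still lives on the stack, but the span remembers its length, so the stray write from the earlier example becomes an exception instead of silent corruption.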

Conclusion

Microsoft chose the term “unsafe” to warn programmers of the risk they take on after typing that word. Using pointers, as we have seen in this article, has a lot of advantages, from flexibility to high performance, but a very small error, even a simple typing mistake, might cause your entire application to crash. Worse, this could happen randomly and unpredictably, which makes debugging much harder.

14. Using Microsoft Speech Agent in C#

With millions of Web sites out there on the Internet, you’re going to have to think up some pretty innovative ways for your Web site to be noteworthy. And let’s face it, without a Web site that stands out, users won’t stick around and spend much time, or even come back.

You’re looking for more than just a pretty Web site—there are more important things than just luring people in. A lot of people just plain have trouble reading. And if your Web applications are speech enabled, users won’t have to do too much reading to benefit from the contents. Or how about users who might be seriously visually impaired or blind? By speech-enabling your Web site, even these people can benefit from the contents.

One more very powerful argument for using these Microsoft Agent technologies in your Web site is that they can provide not only speech, but also animation and interactivity. With them, your Web site visitors can get a guided tour of your site’s products and services. They can ask questions and get answers. An agent character can point to the rise in sales on your company’s chart, which will be much more powerful than a plain graph.

The technology is actually built on ActiveX. Two ActiveX controls provide the Microsoft Agent character, animation, and speech capabilities. These controls are not server-side components, but they are embedded into the HTML and are invoked just as any other ActiveX controls would be in an HTML page.

You need to be aware of a few things before you use the Microsoft Agent components. The first is that the core components must be installed on the client computer before a speech-enabled Web site will work properly. The second thing you must make sure of is that the agent characters have been installed on the computer. And the third thing you must be sure about is that the speech module, which lets the Microsoft Agent components speak through your sound system, is installed.

You can download all the agent requirements from Microsoft Agent home Page:

15. Linked List Implementation in C#.NET

Last Updated: 11 January 2005

One of the challenges faced by an application designer is reading in large sequential files and having to also load the data in processing arrays as it is read. For example, let’s say we have a large data file and we have groupings within that file bounded by data indicators that signal a switching of the group. We read and process the file by group and when we are finished with the group, the data buffers must be refreshed. This data structure at first sounds like an array. However, here is the real issue: How big do we build the array? This is a question not easily answered when one cannot determine the “largest case” scenario for an array. And the issue grows even larger when we have a requirement to have what is known as a “jagged array” where dimensions are not always equal or uniform.

In this article, I will refer to an actual project where a large input file was opened up for sequential read and the data within that file was logically divided into groups with varying amounts of related data per group. For the data within the group, we might have to access any one of the rows to determine if it is there in order to make processing decisions. In the old days, we would try to estimate the maximum number of rows in a group and build the array that size. This can cause a couple of issues. First, we do not really know the maximum number of rows that will be in any given group — we can only estimate. If we estimate wrong, then this leads to an error where the array is not large enough. Second, even if we size it right, there is tremendous memory waste because the array area must be allocated in a fixed length. Figure 1 below shows this situation.

Figure 1

Figure 1 shows a diagram of a group structure if fixed length arrays are used. GP1 is the group header record from the sequential file. We need to store the record in the array because there are items of data needed within it and therefore we have to allocate a full array row for it. SGA, SGB, and SGC are all different segments types that are stored in the array within group GP1. The segment type is also like a sub-grouping within GP1 and their individual segments are suffixed with a number 1, 2, 3, etc. Every cell in the diagram marked “Unused” is just that. They are “Unused” because the number of elements in that particular segment are assigned a cell but the array is fixed length and we therefore have to allocate up to the maximum number of cells. Every cell marked “Unused” is a wasted memory area.

There are two data structures that were used to build a jagged array and solve these problems. The first structure was a linked-list implementation, which had to be built through code, and the second is a built-in C# structure known as an ArrayList. This article will discuss how using a combination of the two structures yields an overall structure as shown in Figure 2:

Figure 2

The Linked List and ArrayList Structures

The linked list structure is basically a series of nodes with a “next” pointer that binds them like a chain. The advantages of this type of structure are:

  • Very fast
  • Can be quickly discarded and declared new (as opposed to clearing each element like in an array)
  • Uses only the memory needed — there is no memory waste

The disadvantages are:

  • Can be very cryptic and hard to understand

In our implementation, we use a C# structure known as an ArrayList to store the linked lists that will be the offshoot branches along the chain. Why do we use an ArrayList and not another linked list? Simple. It is for simplicity’s sake only. If we were to use a series of branches of linked lists, then we would need more complex implementations of the GoToNode method in the dll. We would have to traverse the tree to find a specific node on a sub branch. With an ArrayList, we simply push the entire sublist on the ArrayList and can pop it off with one index (as it will only hold one list in our implementation here).

Implementation: Building the DLL

The basic building block of a linked list is a structure known as the node. We started this application by building a .dll named LinkList.dll. The class declaration that defines a node within the linked list is shown in figure 3. The squares in figure 2 above are referred to as nodes.

public class LlNode // This is the node object within a linked list
{
public string status; // You can put a status indicator in here
public ArrayList minornode = new ArrayList(); // This is the offshoot branch
public string data; // This is the data buffer — put your data here
public LlNode next; // Will point to the next node in the list

}; // *** LlNode declaration ***

Figure 3

The field “status” is a string area that you can use to indicate some kind of state information, or anything you want for that matter. The data within this field exists for the life of the node. The next field within the class is of type ArrayList and is named minornode. Here, we will actually place another linked list as a branch node. The segments SGA, SGB, and SGC displayed in figure 2 actually start at minornode. The string area “data” is the actual data buffer and will hold the input record area. And finally, but very importantly, there is the field next, which is a reference (in effect a pointer) to the next node in the list being built.

Once we have the node defined, we want to create a class for the linked list data structure and methods. Our class will have the following methods:

  • LinkedList(): the constructor method.
  • GoToNode(int pos): positions to an absolute location (indicated by pos) on the linked list.
  • PushNode(LlNode newnode, int pos): pushes a new node onto the linked list (which in our implementation is somewhat like a stack, in that nodes are placed on the list in last-in-first-out order).
  • ListLen(): returns an integer value of the linked list length.
  • PrintList(): prints the contents of an entire linked list on the console.
  • Exists(LlNode pnode, string srch, ref int pindex): returns a boolean to indicate if a type of data segment (i.e., SGA, SGB, etc.) exists on the linked list tree. This was used in the implementation for validating that required segments were there.

Remember, what we want is a way to read in an associated grouping of records into a jagged array, manipulate or extract the information from the array, and then dispose of the grouping in memory when finished. This is useful for processing situations like reading in all personal information for an account holder, getting the information you need, and then disposing of the structure. When using fixed-length arrays, you have to initialize each cell. This can be time consuming and resource intensive. With a linked list, all you have to do is reset or declare “new” the list array object.
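The source of LinkList.dll is not reproduced in this article, so the following is only a minimal sketch of how three of the methods listed above might be written. The bodies of GoToNode, PushNode and ListLen, the private fields head and length, and the ListSketch driver class are all assumptions made for illustration, not the original implementation.

```csharp
using System;
using System.Collections;

public class LlNode
{
    public string status;
    public ArrayList minornode = new ArrayList();
    public string data;
    public LlNode next;
}

public class LinkedList
{
    private LlNode head;   // assumed: entry point of the list (position 0)
    private int length;    // assumed: running count of nodes

    // Returns the node at position pos, or null if pos is out of range.
    public LlNode GoToNode(int pos)
    {
        if (pos < 0) return null;
        LlNode current = head;
        for (int i = 0; i < pos && current != null; i++)
        {
            current = current.next;
        }
        return current;
    }

    // Inserts newnode at position pos; position 0 pushes onto the front.
    // A position past the end appends to the end of the list.
    public void PushNode(LlNode newnode, int pos)
    {
        if (pos <= 0 || head == null)
        {
            newnode.next = head;
            head = newnode;
        }
        else
        {
            LlNode prev = head;
            for (int i = 1; i < pos && prev.next != null; i++)
            {
                prev = prev.next;
            }
            newnode.next = prev.next;
            prev.next = newnode;
        }
        length++;
    }

    public int ListLen()
    {
        return length;
    }
}

public static class ListSketch
{
    public static void Main()
    {
        LinkedList list = new LinkedList();
        foreach (string s in new string[] { "A", "B", "C" })
        {
            LlNode n = new LlNode();
            n.data = s;
            list.PushNode(n, 0);   // always push at the entry point
        }
        Console.WriteLine(list.GoToNode(0).data);   // prints C (last pushed)
        Console.WriteLine(list.ListLen());          // prints 3
    }
}
```

Because every push goes to position 0, the most recently added record sits at the front, which is the stack-like behavior described in the method list.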

Creating a New List

To instantiate a new linked list you must declare a type of LinkedList and construct the new object. For example, to declare and instantiate the LinkedList object MyList we would have the following code:

public static LinkedList MyList = new LinkedList();

Nodes are part of the list therefore we must instantiate a new node with the following code:

public static LlNode MyNode = new LlNode();

Building the List

When building the list, we work at the node level first. We fill the node with the input record data and then push it onto the list much the same way we would treat a stack structure. In the following code example, we add the data from the input flat file record to the instance of node and then use the PushNode method to place it on the list:

LinkedList templist = new LinkedList();    // templist is a new linked list structure
LlNode node = new LlNode();                // node is the LlNode that will get placed on the list
node.data = sInRec;                        // the data buffer of node now has the input record data
templist.PushNode(node, 0);                // push the node on the list at the entry point

Processing the Records

Our sample file has the following records in it:

HDR 0100
GRP GROUP1
SGT SEGMENT A1
SGT SEGMENT A2
SGT SEGMENT A3
SGT SEGMENT A4
GRP GROUP2
SGT SEGMENT A1 G2
GRP GROUP3
SGT SEGMENT A1
SGT SEGMENT A2
SGT SEGMENT A3
SGT SEGMENT A4
SGT SEGMENT A5
SGT SEGMENT A6
SGT SEGMENT A7
SGT SEGMENT A8

This file, as listed above, has a header record (HDR), three group records (GRP), and various segments (SGT) per group.

Now, let’s trace the life cycle of a record from flat file to linked list to disposal. Looking at the program listing LinkedListMain.cs in the class MainClass, we see that the program has one argument on command line: the input text filename (above) which we named datafile.txt. The file is opened for input via the StreamReader class and read sequentially. For each record read, control is passed to the “Process” method:

StreamReader oInfile = File.OpenText(sInfilename);
sInRec = oInfile.ReadLine();
iState = HEADER;
while (sInRec != null) {
Process();
sInRec = oInfile.ReadLine();
}

The method “Process” starts by passing each record through a simple state machine. Basically, if the first three bytes of the record are “HDR”, then this indicates an informational header record, and the “state” becomes the constant HEADER. The code next looks for a record type (the first 3 bytes in the record) of GRP. The state then becomes the constant GRP.

if (sType == "HDR") {
iState = HEADER;
}
if (sType == "GRP" && iState == HEADER) {
MyNode.status = "Initialized";
iState = GRP; }
else
if (sType == "GRP" && iState == GRP) {
MyNode.data = "MyNode";
MyList.PushNode(MyNode, 0);
Console.WriteLine("Show list");
MyList.PrintList();
bExists = MyList.Exists(MyNode, "GRP", ref pindex);
PrintMyList(0); // Print the grouping node
PrintMyList(1); // Print all segments in the group (branch)
}
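The record-type test on the first three characters can be tried in isolation. The sketch below is a simplified stand-in for the article’s state machine: it uses the generic List<T> instead of the LinkedList and ArrayList combination, and the names RecordGrouper and Group are invented for the example.

```csharp
using System;
using System.Collections.Generic;

public static class RecordGrouper
{
    // Splits records into groups: each GRP record starts a new group and
    // every following SGT record is appended to that group.
    public static List<List<string>> Group(IEnumerable<string> records)
    {
        List<List<string>> groups = new List<List<string>>();
        List<string> current = null;
        foreach (string rec in records)
        {
            if (rec.Length < 3) continue;        // too short to carry a record type
            string type = rec.Substring(0, 3);   // first three characters
            if (type == "GRP")
            {
                current = new List<string>();
                current.Add(rec);
                groups.Add(current);
            }
            else if (type == "SGT" && current != null)
            {
                current.Add(rec);
            }
            // HDR and unknown record types are ignored in this sketch
        }
        return groups;
    }

    public static void Main()
    {
        string[] sample =
        {
            "HDR 0100", "GRP GROUP1", "SGT SEGMENT A1", "SGT SEGMENT A2",
            "GRP GROUP2", "SGT SEGMENT A1 G2"
        };
        foreach (List<string> g in Group(sample))
        {
            Console.WriteLine("{0}: {1} segment(s)", g[0], g.Count - 1);
        }
    }
}
```

Each inner list grows only as far as its group needs, which is the jagged shape the article is after.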

At the occurrence of the GRP record, a new list node or LlNode is created. This will store the linked list (or really a pointer to the linklist stored in minornode) of all segments within the grouping. A temporary list named templist is created and a new instance of the list node area MyNode is declared. The data from the flat file record is placed in the MyNode.data buffer and the node is pushed on at position 0 in the linked list templist.

switch (sType) {
case "GRP":
// We can put some checks here to make sure
// that we do not add more than one GRP per grouping

// This is the first segment in this loop and thus a
// new grouping List node is declared;
MyNode = new LlNode();

// node is the ListNode object where the segment’s data
// is stored.
LlNode node = new LlNode();

// templist will ultimately end up on the minornode ArrayList
templist = new LinkedList();
node.data = sInRec;

// add the data for this segment to templist
templist.PushNode(node,0);

// push templist onto the ArrayList minornode
MyNode.minornode.Add(templist);
MyNode.status = "nothing";
break;

Figure 4 shows the structure of the list at the occurrence of a new grouping event:

Figure 4

After the GRP segment, the program expects the data segments within the group or SGT records. The program will read as many as there are before the next GRP record. Here is where the branch list gets built off of the minor node of MyNode for each SGT record read.

case "SGT":
// Declare new instance of node — it will store the input rec
node = new LlNode();
node.data = sInRec;

// Declare a new working list
templist = new LinkedList();

// Insert node into working list
templist.PushNode(node,0);

// Check to see if there is already a "SGT" list in this loop node
// The Exists() method will return the index of the segment if true
bExists = MyList.Exists(MyNode, "SGT", ref pindex);

Figure 5 shows the resulting structure after reading and processing the SGT segments:

Figure 5

The linked list just keeps building until the end of the flat file or another GRP record is encountered to indicate a new grouping. Note that in order to “push” the templist onto the ArrayList indicated in minornode, we must remove the old list first then insert the new list with the RemoveAt and Insert methods of the ArrayList class.

Upon encountering a new GRP record, we can extract, process, and manipulate the data in the linked list before discarding it. Before placing a new GRP segment in the new grouping, we initialize or create a new list node by doing: MyNode = new LlNode();. At the point where MyNode is declared new, the old memory allocated for the linked list branching from MyNode is gone and the .NET garbage collector comes along to reuse it. A new node with the same name, MyNode, is now prepared to point to a new list.

By using the method PrintMyList(node index), we can dump the contents of the entire list to the console. This method travels down the chain from the end to the start in order to print the list in first in first out order. Why is the list built in reverse? It is because we are always pushing a new node on at position 0 with the PushNode method. This is much easier than trying to figure out where the end of list is and append to the end. In the PrintMyList method, the order is reversed so that you have the impression that the records are on the list in original order.

A Recommendation

Use the Visual Studio or DBGCLR debuggers to correct any bugs found in your linked list implementation. Otherwise you will get thoroughly confused when trying to isolate problems in the linked list processing. These debuggers do an excellent job in “exploding” the linked list and letting you see the entire tree. Figure 6 shows a Visual Studio .NET debugging session for this program with the linked list exploded:

Figure 6


Some Final Thoughts

This is a simple example of a combination linked list and ArrayList. The intent is for you to be able to take this code and adapt and modify to fit your particular data processing needs. Without structures like this, we are forced to table grouped information in fixed-types of structures such as random access flat files, database tables, and fixed-length arrays. All of these structures are either memory-intensive or input/output intensive. The linked list is very efficient on memory and very fast. When you have to do large volume table processing of sequential data give this structure consideration. It just might do the trick for you.

16. Boxing and Unboxing in C# .Net

Introduction

In this article I will explain the concepts of boxing and unboxing. C# provides us with value types and reference types. Value types are typically stored on the stack (when they are local variables) and reference types are stored on the heap. The conversion of a value type to a reference type is known as boxing, and converting the reference type back to the value type is known as unboxing.

Let me explain a little more about value and reference types.

Value Types

Value types include the primitive types, whose C# keywords map directly to types in the FCL: for example, int maps to System.Int32 and double maps to System.Double. All value types derive from System.ValueType. Structures and enumerated types also derive from System.ValueType, and local variables of these types are created on the stack.

Reference Types

Reference Types are different from value types in such a way that memory is allocated to them from the heap. All the classes are of reference type. C# new operator returns the memory address of the object.
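The practical difference shows up when a variable is copied. The following sketch, with made-up type names PointValue, PointRef and CopyDemo, assigns one variable to another and then changes the copy.

```csharp
using System;

public struct PointValue { public int X; }   // value type
public class PointRef { public int X; }      // reference type

public static class CopyDemo
{
    public static int CopyStruct()
    {
        PointValue a = new PointValue();
        a.X = 1;
        PointValue b = a;   // copies the whole value
        b.X = 2;
        return a.X;         // a is unaffected by the change to b
    }

    public static int CopyClass()
    {
        PointRef a = new PointRef();
        a.X = 1;
        PointRef b = a;     // copies only the reference
        b.X = 2;
        return a.X;         // a and b refer to the same object
    }

    public static void Main()
    {
        Console.WriteLine(CopyStruct());   // prints 1
        Console.WriteLine(CopyClass());    // prints 2
    }
}
```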

Examples

Let’s see some examples to get a better understanding of value types and reference types. Since we know that all value types are derived from System.ValueType, we can write something like this:

System.ValueType r = 5;

So what do you think about the above line of code? Will it compile? Yes, it will. But what is the type of the stored value? System.ValueType is only the base class from which all value types inherit, so is it Int32, Int64, double, decimal, or something else? It turns out that the type of the value held by ‘r’ is System.Int32. The question arises: why Int32 and not Int16? It is because the integer literal 5 has type int (System.Int32) by default; the type is determined by the initial value of the variable.

You cannot write something like the following, since System.ValueType is not a primitive type; it is the base class for the value types, and arithmetic operators such as ++ are defined only on the primitive types:

System.ValueType r = 10;

r++;

In the above example I told you that variable ‘r’ will hold a System.Int32, but if you don’t believe me then you can find out yourself using the GetType() method:

System.ValueType r = 5;

Console.WriteLine(r.GetType()); // prints System.Int32

Here are a few samples you can try on your own:

System.ValueType r = 23.45;
Console.WriteLine(r.GetType()); // What does this print
//-------------------------------------------------------
System.ValueType r = 23.45F;
Console.WriteLine(r.GetType()); // What does this print
//-------------------------------------------------------
System.ValueType r = 2U;
Console.WriteLine(r.GetType()); // What does this print
//-------------------------------------------------------
System.ValueType r = 'c';
Console.WriteLine(r.GetType()); // What does this print
//-------------------------------------------------------
System.ValueType r = 'ac';
Console.WriteLine(r.GetType()); // tricky
//-------------------------------------------------------
System.ValueType r = "Hello World";
Console.WriteLine(r.GetType()); // tricky

Boxing

Let’s now jump to boxing. Sometimes we need to convert value types to reference types, which is known as boxing. Let’s see a small example below. The first snippet is marked “implicit boxing”, which means you don’t need to tell the compiler that you are boxing an Int32 to object because it takes care of this itself, although you can always box explicitly, as the second snippet shows.

Int32 x = 10;
object o = x; // Implicit boxing
Console.WriteLine("The Object o = {0}",o); // prints out 10
//-----------------------------------------------------------
Int32 x = 10;
object o = (object) x; // Explicit boxing
Console.WriteLine("The object o = {0}",o); // prints out 10
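One consequence worth seeing in code is that boxing copies the value into a new object, so later changes to the original variable do not reach the boxed copy. A small sketch, with the names BoxingDemo and BoxedValueAfterChange chosen only for this example:

```csharp
using System;

public static class BoxingDemo
{
    public static int BoxedValueAfterChange()
    {
        int x = 10;
        object o = x;     // boxing copies the value into a new object
        x = 20;           // changes only the original variable
        return (int)o;    // the boxed copy still holds 10
    }

    public static void Main()
    {
        Console.WriteLine(BoxedValueAfterChange());   // prints 10
    }
}
```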

Unboxing

Let’s now see unboxing an object back to a value type. Here is a simple piece of code that unboxes an object back to an Int32 variable. First we need to box it so that we can unbox.

Int32 x = 5;
object o = x; // Implicit Boxing
x = o; // Implicit UnBoxing

So, you see how easy it is to box and how easy it is to unbox. The above example first boxes an Int32 variable to an object and then simply unboxes it to x again, with all the conversions taking place implicitly. Everything seems right in this example; there is just one small problem, which is that the above code will not compile. You cannot implicitly convert a reference type to a value type. You must explicitly specify that you are unboxing, as shown in the code below.

Int32 x = 5;
object o = x; // Implicit Boxing
x = (Int32)o; // Explicit UnBoxing

Let’s see another small example of unboxing.

Int32 x = 5; // declaring Int32
Int64 y = 0; // declaring Int64
object o = x; // Implicit boxing
y = (Int64)o; // Explicit unboxing to Int64
Console.WriteLine("y={0}",y);

This example will not work. It will compile successfully, but at run time it will throw a System.InvalidCastException. The reason is that variable x was boxed as an Int32, so it must be unboxed as an Int32: the type used when unboxing must be the same type that was boxed. Of course, you can cast it to Int64 after unboxing it as Int32, as follows:

Int32 x = 5; // declaring Int32
Int64 y = 0; // declaring Int64
object o = x; // Implicit boxing
y = (Int64)(Int32)o; // Unboxing to Int32 and then casting to Int64
Console.WriteLine("y={0}",y);
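To watch the failure happen without crashing the program, the mismatched unboxing can be wrapped in a try/catch. The sketch below also shows Convert.ToInt64 as one alternative when the boxed type is not known in advance; the names UnboxDemo, DirectUnboxFails and ToInt64 are made up for illustration.

```csharp
using System;

public static class UnboxDemo
{
    // Returns true if unboxing a boxed Int32 directly as Int64 throws.
    public static bool DirectUnboxFails()
    {
        object o = 5;   // boxed Int32
        try
        {
            long y = (long)o;       // type mismatch: the box holds an Int32
            Console.WriteLine(y);
            return false;
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }

    // Converts any boxed numeric value to Int64 through IConvertible.
    public static long ToInt64(object boxed)
    {
        return Convert.ToInt64(boxed);
    }

    public static void Main()
    {
        Console.WriteLine(DirectUnboxFails());   // prints True
        Console.WriteLine(ToInt64(5));           // prints 5
    }
}
```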

I am sure that you have all grasped the basics of boxing and unboxing. Happy coding, and practice a lot!

17. HTML Screen Scraping using C# .Net WebClient

What is Screen Scraping ?

Screen Scraping means reading the contents of a web page. Suppose you go to yahoo.com, what you see is the interface which includes buttons, links, images etc. What we don’t see is the target url of the links, the name of the images, the method used by the button which can be POST or GET. In other words we don’t see the HTML behind the pages. Screen Scraping pulls the HTML of the web page. This HTML includes every HTML tag that is used to make up the page.

Why use screen scraping ?

The question that comes to our mind is why do we ever want the HTML of any web page. Screen Scraping does not stop only on pulling out the HTML but displaying it also. In other words you can pull out the HTML from any web page and display that web page on your page. It can be used as frames. But the good thing about screen scraping is that it is supported by all browsers and frames unfortunately are not.

Also sometimes you go to a website which has many links which says image1, image2, image3 and so on. In order to see those images you have to click on the image and it will enlarge in the parent or the new window. By using screen scraping you can pull all the images from a particular web page and display them on your own page.

Displaying a web page on your own page using Screen Scraping :

Let’s see a small code snippet which you can use to display any page on your own page. First make a small interface as I have made below. As you can see, the interface is quite simple. It has a button which says “Display WebPages below”, and the web page, believe it or not, will be displayed in place of the label. All the code will be written for the Button Click event. Below you can see the “Button Click Code”.

C# Button Click Code :

private void Button1_Click(object sender, System.EventArgs e)
{
WebClient webClient = new WebClient();
const string strUrl = "http://www.yahoo.com/";
byte[] reqHTML;
reqHTML = webClient.DownloadData(strUrl);
UTF8Encoding objUTF8 = new UTF8Encoding();
lblWebpage.Text = objUTF8.GetString(reqHTML);

}

Explanation of the Code Snippet in C#:

As you can see, the code is only a few lines long. This is because Microsoft .NET has a very strong set of class libraries that make the task easier for the developer. If you were trying to achieve the same result in classic ASP, you might have to write a lot more code. I guess that’s good news for all the coders out there in the programming world.

In the first line I made an object of the WebClient class. The WebClient class provides common methods for sending data to or receiving data from any local, intranet, or Internet resource identified by a URI.

In the next line we define a constant string strUrl, which holds the url of the web page we wish to use in our example.

Then we declared a byte array reqHTML which will hold the bytes transferred from the web page.

The next line downloads the data in the form of bytes and puts them in the reqHTML byte array.

The UTF8Encoding class represents the UTF-8 encoding of Unicode characters.

In the next line we use the GetString method of the UTF8Encoding class to convert the bytes to a string, and finally we assign the result to the label.

This code fetches the www.yahoo.com homepage; once the label is bound to the HTML of the yahoo page, the whole yahoo page is displayed.

The Generated HTML :

For those curious people who want to see what HTML was generated when the request was made, you can easily view it by viewing the source of the yahoo page. In Internet Explorer, go to View -> Source. Notepad will open with the complete HTML of the page. Let’s see a small screen shot of the HTML generated when we visit yahoo.com. As you can see, the HTML generated is quite complex. Wouldn’t it be really cool if you could extract all the links from the generated source? Let’s try to do that.

Extracting Urls :

The first thing you need in order to extract all the urls from the web page is a regular expression. I am not saying you cannot do this without a regular expression; you can, but it will be much harder.

Regular Expression for Extracting Urls :

First you need to introduce System.Text.RegularExpressions. Next you need to make a regular expression that can extract all urls from the generated HTML. There are many regular expressions already made for you which you can view at http://www.regexlib.com/ . Your regular expression would like this:

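Regex r = new Regex("href\\s*=\\s*(?:(?:\\\"(?<url>[^\\\"]*)\\\")|(?<url>[^\\s]* ))");

This matches every href attribute in the web page source: the text href, optional whitespace, an equals sign, more optional whitespace, and then either a value in double quotes or an unquoted value that runs up to the next space. The value itself is captured by a named group. The name url used here is an assumption, because the group name is missing from the listing as published.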
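To see how such a pattern behaves, here is a minimal sketch that runs an href pattern over a small hard-coded HTML string instead of a downloaded page. The class name LinkExtractor, the method ExtractLinks, the group name url, and the sample HTML are all assumptions made for illustration. The pattern is also a slightly simplified variant of the one above: it is written as a verbatim string, and the unquoted alternative stops at whitespace or a closing angle bracket.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class LinkExtractor
{
    // Assumed pattern: href, optional whitespace, '=', optional whitespace,
    // then either a double-quoted value or an unquoted run of characters.
    // Both alternatives capture into the same named group "url".
    private static readonly Regex HrefPattern = new Regex(
        @"href\s*=\s*(?:""(?<url>[^""]*)""|(?<url>[^\s>]+))",
        RegexOptions.IgnoreCase);

    // Returns only the captured attribute values, not the whole "href=..." match.
    public static List<string> ExtractLinks(string html)
    {
        List<string> links = new List<string>();
        foreach (Match m in HrefPattern.Matches(html))
        {
            links.Add(m.Groups["url"].Value);
        }
        return links;
    }

    public static void Main()
    {
        // Made-up sample input with one quoted and one unquoted href.
        string sample = "<a href=\"http://example.com/a\">A</a> <a href = about.html>B</a>";
        foreach (string link in ExtractLinks(sample))
        {
            Console.WriteLine(link);
        }
    }
}
```

Reading the value through the named group rather than looping over every group avoids collecting the full match text alongside each url.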

User Interface in Visual Studio .Net:

I am keeping the user interface pretty simple. It consists of a textbox, a datagrid and a button. The datagrid will be used to display all the extracted urls.

Here is a screen shot of the User Interface.

The Code:

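Okay, the code is implemented in the button click event. But before that, let's see the important declarations. You also need to include the following namespaces:

using System.Collections; // ArrayList
using System.IO; // only if you plan to write to a file
using System.Net; // WebClient
using System.Text; // UTF8Encoding
using System.Text.RegularExpressions; // Regex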

// creates a button
protected System.Web.UI.WebControls.Button Button1;
// creates a byte array
private byte[] aRequestHTML;
// creates a string
private string myString = null;
// creates a datagrid
protected System.Web.UI.WebControls.DataGrid DataGrid1;
// creates a textbox
protected System.Web.UI.WebControls.TextBox TextBox1;
// creates the label
protected System.Web.UI.WebControls.Label Label1;
// creates the arraylist
private ArrayList a = new ArrayList();

Okay, now let's see the button click code that does the actual work.

private void Button1_Click(object sender, System.EventArgs e)
{
    // make an object of the WebClient class
    WebClient objWebClient = new WebClient();
    // get the HTML, as bytes, from the url written in the textbox
    aRequestHTML = objWebClient.DownloadData(TextBox1.Text);
    // create a UTF-8 encoding object
    UTF8Encoding utf8 = new UTF8Encoding();
    // decode the bytes in aRequestHTML into a string
    myString = utf8.GetString(aRequestHTML);
    // this is the regular expression that matches the urls
    // (the group name "url" is an assumption; it is missing from the listing as published)
    Regex r = new Regex("href\\s*=\\s*(?:(?:\\\"(?<url>[^\\\"]*)\\\")|(?<url>[^\\s]* ))");
    // get all the matches for the regular expression
    MatchCollection mcl = r.Matches(myString);

    foreach (Match ml in mcl)
    {
        // Groups[0] is the whole match; the named group holds the url itself
        foreach (Group g in ml.Groups)
        {
            string b = g.Value + "\n";
            // add the extracted text to the array list
            a.Add(b);
        }
    }
    // assign the arraylist as the data source
    DataGrid1.DataSource = a;
    // bind the data to the datagrid
    DataGrid1.DataBind();

    // write the downloaded HTML to the file named test.txt
    StreamWriter sw = new StreamWriter(Server.MapPath("test.txt"));
    sw.Write(myString);
    sw.Close();
}

The MatchCollection mcl has all the extracted urls and you can iterate through the collection to get all of them. Once you enter the url in the textbox and press the button, the datagrid will be populated with the extracted urls. Here is a screenshot of the datagrid. The screenshot shows only a few of the extracted urls; there are at least 50 of them.
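The handler above writes the downloaded HTML (myString) to test.txt. If you would rather save the extracted urls, one per line, a minimal sketch could look like the following. The class name UrlFileWriter, the method SaveUrls, the sample urls and the use of the temp folder are assumptions made for illustration; in the web form you would pass the collected urls and a path from Server.MapPath.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class UrlFileWriter
{
    // Hypothetical helper: writes each url on its own line to the given file,
    // replacing any existing content.
    public static void SaveUrls(IEnumerable<string> urls, string path)
    {
        using (StreamWriter writer = new StreamWriter(path))
        {
            foreach (string url in urls)
            {
                writer.WriteLine(url);
            }
        }
    }

    public static void Main()
    {
        // Made-up urls standing in for the ones extracted from a page.
        List<string> urls = new List<string> { "http://example.com/a", "http://example.com/b" };
        string path = Path.Combine(Path.GetTempPath(), "test.txt");
        SaveUrls(urls, path);
        Console.WriteLine(File.ReadAllText(path));
    }
}
```

The using block closes the StreamWriter even if a write fails, which the explicit Close call would not do.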

Final Note:

As you can see, it's simple to extract urls from any web page. You can also make the column in the datagrid a hyperlink column so you can browse to the extracted urls.
