Monthly Archives: May 2016

CAP Theorem – Consistency, Availability and Partition Tolerance

  • Consistency – It means that data is the same across the cluster, so you can read or write to/from any node and get the same data.
  • Availability – It means the ability to access the cluster even if a node in the cluster goes down.
  • Partition Tolerance – It means that the cluster continues to function even if there is a “partition” (communications break) between two nodes (both nodes are up, but can’t communicate).

In order to get both availability and partition tolerance, you have to give up consistency. Consider two nodes, X and Y, in a master-master setup. Now suppose network communication between X and Y breaks, so they can’t sync updates. At this point you can either:

A) Allow the nodes to get out of sync (giving up consistency), or

B) Consider the cluster to be “down” (giving up availability)

All the combinations available are:

  • CA – data is consistent between all nodes – as long as all nodes are online – and you can read/write from any node and be sure that the data is the same, but if you ever develop a partition between nodes, the data will be out of sync (and won’t re-sync once the partition is resolved). In practice however, there are of course CA systems, e.g. single-node RDBMS. Or even master-slave/master-master replicated RDBMS, provided there’s a central router knowing which nodes live and directing client appropriately. Such a router is then a single poin
  • CP – data is consistent between all nodes, and maintains partition tolerance (preventing data desync) by becoming unavailable when a node goes down.
  • AP – nodes remain online even if they can’t communicate with each other and will re-sync data once the partition is resolved, but you aren’t guaranteed that all nodes will have the same data (either during or after the partition).
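The trade-off between CP and AP can be sketched with a toy two-node cluster. This is only an illustration under simple assumptions: the Cluster class, the mode flag and the partitioned flag are made-up names, and replication is modelled as a plain synchronous copy.

```javascript
// Hypothetical two-node cluster; all names here are illustrative only.
class Cluster {
  constructor(mode) {
    this.mode = mode;            // "CP" or "AP"
    this.partitioned = false;    // true while X and Y cannot communicate
    this.nodes = { X: new Map(), Y: new Map() };
  }

  write(nodeName, key, value) {
    if (this.partitioned && this.mode === "CP") {
      // CP: refuse the write rather than let the replicas diverge.
      throw new Error("unavailable: cannot replicate during partition");
    }
    this.nodes[nodeName].set(key, value);
    if (!this.partitioned) {
      // Healthy network: copy the write to the other node as well.
      const other = nodeName === "X" ? "Y" : "X";
      this.nodes[other].set(key, value);
    }
  }

  read(nodeName, key) {
    return this.nodes[nodeName].get(key);
  }
}

const ap = new Cluster("AP");
ap.write("X", "color", "red");     // replicated to both nodes
ap.partitioned = true;
ap.write("X", "color", "blue");    // accepted, but Y never hears about it
console.log(ap.read("X", "color"), ap.read("Y", "color")); // blue red

const cp = new Cluster("CP");
cp.partitioned = true;
try {
  cp.write("X", "color", "blue");
} catch (err) {
  console.log(err.message);        // unavailable: cannot replicate during partition
}
```

The AP cluster stays writable but its nodes disagree, while the CP cluster keeps its nodes in agreement by rejecting the write.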

The CAP triangle can be visualized as shown below.

CAPTriangle

The CA, CP and AP combinations are summarized in the diagrams below; all diagrams are courtesy of SlideShare presentations.

CAP-AP-CP-CA-Combinations

Choosing AP

CAP-AP

Choosing CP

CAP-CP

 

Everyone is aware of ACID:

  • Atomic: Everything in a transaction succeeds or the entire transaction is rolled back.
  • Consistent: A transaction cannot leave the database in an inconsistent state.
  • Isolated: Transactions cannot interfere with each other.
  • Durable: Completed transactions persist, even when servers restart etc.
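Atomicity in particular can be sketched in a few lines. The example below is a minimal in-memory illustration, not a real transaction manager; runAtomically and transfer are names invented for this sketch, and the rollback simply restores a snapshot taken before the work started.

```javascript
// Illustrative in-memory "database"; not a real transaction manager.
function runAtomically(accounts, work) {
  const snapshot = { ...accounts };      // copy the state before the transaction
  try {
    work(accounts);                      // apply all changes
    return true;                         // commit: keep the changes
  } catch (err) {
    // Roll back: restore every balance to its value before the transaction.
    for (const key of Object.keys(accounts)) delete accounts[key];
    Object.assign(accounts, snapshot);
    return false;
  }
}

function transfer(from, to, amount) {
  return (accounts) => {
    accounts[from] -= amount;
    if (accounts[from] < 0) throw new Error("insufficient funds");
    accounts[to] += amount;
  };
}

const accounts = { alice: 100, bob: 50 };
console.log(runAtomically(accounts, transfer("alice", "bob", 30)));  // true
console.log(runAtomically(accounts, transfer("alice", "bob", 500))); // false
console.log(accounts); // { alice: 70, bob: 80 }
```

The second transfer fails halfway through, after the debit but before the credit, and the rollback leaves both balances exactly as they were.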

ACID transactions are often more pessimistic (i.e., more worried about data safety) than the domain actually requires.

In the NoSQL world, ACID transactions are less fashionable, as some databases have loosened the requirements for immediate consistency, data freshness and accuracy in order to gain other benefits, like scale and resilience. That’s where the BASE acronym comes into the picture:

Basic Availability

    • The database appears to work most of the time.

Soft-state

    • Stores don’t have to be write-consistent, nor do different replicas have to be mutually consistent all the time.

Eventual consistency

    • Stores exhibit consistency at some later point (e.g., lazily at read time).
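One way to picture consistency arriving "lazily at read time" is read repair. The sketch below assumes each replica stores a value together with a version number and that a higher version always wins; both the data layout and the readWithRepair name are assumptions made for illustration, and real stores use more careful conflict resolution.

```javascript
// Each replica stores { value, version }; a higher version means a newer write.
// The replica layout and function name are assumptions made for this sketch.
function readWithRepair(replicas, key) {
  let newest;
  for (const replica of replicas) {
    const entry = replica.get(key);
    if (entry && (!newest || entry.version > newest.version)) newest = entry;
  }
  if (newest) {
    // Lazily bring stale replicas up to date as a side effect of the read.
    for (const replica of replicas) {
      const entry = replica.get(key);
      if (!entry || entry.version < newest.version) replica.set(key, { ...newest });
    }
  }
  return newest ? newest.value : undefined;
}

const replicaA = new Map([["cart", { value: ["book"], version: 1 }]]);
const replicaB = new Map([["cart", { value: ["book", "pen"], version: 2 }]]);

console.log(replicaA.get("cart").version);                 // 1 (stale)
console.log(readWithRepair([replicaA, replicaB], "cart")); // [ 'book', 'pen' ]
console.log(replicaA.get("cart").version);                 // 2 (repaired)
```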

 

 


HTTP Methods PUT, POST, Idempotent, Safe…

  • Idempotent – An idempotent HTTP method is an HTTP method that can be called many times without producing different outcomes. It does not matter whether the method is called once or ten times over; the result should be the same. Again, this only applies to the result, not the resource itself. The resource can still be manipulated (for example, an update timestamp can change), provided this information is not shared in the (current) resource representation. Consider the following examples:
    a = 4;
    a++;
    The first example is idempotent: no matter how many times we execute this statement, a will always be 4. The second example is not idempotent: executing it 10 times produces a different outcome than executing it 5 times. Since both examples change the value of a, both are non-safe.
  • Safe Methods – Safe methods are HTTP methods that do not modify resources. For instance, using GET or HEAD on a resource URL, should NEVER change the resource. However, this is not completely true. It means: it won’t change the resource representation. It is still possible, that safe methods do change things on a server or resource, but this should not reflect in a different representation.
  • PUT and POST
    • POST – The POST verb is most often used to **create** new resources. In particular, it’s used to create subordinate resources, that is, resources subordinate to some other (e.g. parent) resource. In other words, when creating a new resource, POST to the parent and the service takes care of associating the new resource with the parent, assigning an ID (new resource URI), etc. On successful creation, return HTTP status 201 with a Location header that links to the newly created resource. POST is neither safe nor idempotent. It is therefore recommended for non-idempotent resource requests. Making two identical POST requests will most likely result in two resources containing the same information.
      Examples:
      POST http://www.example.com/customers
      POST http://www.example.com/customers/12345/orders
    • PUT – PUT is most often used for **update** capabilities, PUT-ing to a known resource URI with the request body containing the newly updated representation of the original resource.

      However, PUT can also be used to create a resource in the case where the resource ID is chosen by the client instead of by the server, in other words, if the PUT is to a URI that contains the value of a non-existent resource ID. Again, the request body contains a resource representation. Many feel this is convoluted and confusing. Consequently, this method of creation should be used sparingly, if at all. Alternatively, use POST to create new resources and provide the client-defined ID in the body representation, presumably to a URI that doesn’t include the ID of the resource (see POST above).

      On successful update, return 200 (or 204 if not returning any content in the body) from a PUT. If using PUT for create, return HTTP status 201 on successful creation. A body in the response is optional, and providing one consumes more bandwidth. It is not necessary to return a link via a Location header in the creation case since the client already set the resource ID.

      PUT is not a safe operation, in that it modifies (or creates) state on the server, but it is idempotent. In other words, if you create or update a resource using PUT and then make that same call again, the resource is still there and still has the same state as it did with the first call. If, for instance, calling PUT on a resource increments a counter within the resource, the call is no longer idempotent. Sometimes that happens and it may be enough to document that the call is not idempotent. However, it’s recommended to keep PUT requests idempotent. It is strongly recommended to use POST for non-idempotent requests.

      Examples:
      PUT http://www.example.com/customers/12345
      PUT http://www.example.com/customers/12345/orders/98765
      PUT http://www.example.com/buckets/secret_stuff
  • PATCH – PATCH is used for **modify** capabilities. The PATCH request only needs to contain the changes to the resource, not the complete resource. This resembles PUT, but the body contains a set of instructions describing how a resource currently residing on the server should be modified to produce a new version. This means that the PATCH body should not just be a modified part of the resource, but in some kind of patch language like JSON Patch or XML Patch. PATCH is neither safe nor idempotent. However, a PATCH request can be issued in such a way as to be idempotent, which also helps prevent bad outcomes from collisions between two PATCH requests on the same resource in a similar time frame. Collisions from multiple PATCH requests may be more dangerous than PUT collisions because some patch formats need to operate from a known base-point or else they will corrupt the resource. Clients using this kind of patch application should use a conditional request such that the request will fail if the resource has been updated since the client last accessed the resource. For example, the client can use a strong ETag in an If-Match header on the PATCH request. Examples:
    PATCH http://www.example.com/customers/12345
    PATCH http://www.example.com/customers/12345/orders/98765
    PATCH http://www.example.com/buckets/secret_stuff
  • DELETE – DELETE is pretty easy to understand. It is used to **delete** a resource identified by a URI. On successful deletion, return HTTP status 200 (OK) along with a response body, perhaps the representation of the deleted item (often demands too much bandwidth), or a wrapped response (see Return Values below). Alternatively, return HTTP status 204 (NO CONTENT) with no response body. HTTP-spec-wise, DELETE operations are idempotent. If you DELETE a resource, it’s removed. Repeatedly calling DELETE on that resource ends up the same: the resource is gone. There is a caveat about DELETE idempotence, however. Calling DELETE on a resource a second time will often return a 404 (NOT FOUND) since it was already removed and therefore is no longer findable. By some opinions this makes DELETE operations no longer idempotent; however, the end state of the resource is the same. Returning a 404 is acceptable and communicates accurately the status of the call.
    Examples:
    DELETE http://www.example.com/customers/12345
    DELETE http://www.example.com/customers/12345/orders
    DELETE http://www.example.com/bucket/sample
  • GET – HTTP GET method is used to **read** (or retrieve) a representation of a resource. In the “happy” (or non-error) path, GET returns a representation in XML or JSON and an HTTP response code of 200 (OK). In an error case, it most often returns a 404 (NOT FOUND) or 400 (BAD REQUEST). According to the design of the HTTP specification, GET (along with HEAD) requests are used only to read data and not change it. Therefore, when used this way, they are considered safe. That is, they can be called without risk of data modification or corruption—calling it once has the same effect as calling it 10 times, or none at all. Additionally, GET (and HEAD) is idempotent, which means that making multiple identical requests ends up having the same result as a single request. Do not expose unsafe operations via GET—it should never modify any resources on the server.
    Examples:
    GET http://www.example.com/customers/12345
    GET http://www.example.com/customers/12345/orders
    GET http://www.example.com/buckets/sample
  • HEAD – The HEAD method is functionally similar to GET, except that the server replies with a response line and headers but no entity-body; the server MUST NOT return a message-body in the response. Typically HEAD is used to get metadata about the resource. The meta information contained in the HTTP headers in response to a HEAD request SHOULD be identical to the information sent in response to a GET request. This method can be used for obtaining meta information about the entity implied by the request without transferring the entity-body itself, and it is often used for testing hypertext links for validity, accessibility, and recent modification. In cases where the content of a resource is not particularly important, but the existence of the resource is, HEAD is perfect. We can do everything just like a GET and check the response code, without the weight of the response body.
  • CONNECT Method – The CONNECT method is used by the client to establish a tunnel to a destination server, typically through an HTTP proxy (for example, to carry TLS traffic).
  • OPTIONS Method – The OPTIONS method is used by the client to find out the HTTP methods and other options supported by a web server. The client can specify a URL for the OPTIONS method, or an asterisk (*) to refer to the entire server.
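The conditional-request advice for PATCH can be sketched without any network code. The example below is an in-memory stand-in for a server: handlePatch, the "v1"-style ETag scheme and the simple merge of changes are all assumptions chosen for brevity (a real service would apply a patch format such as JSON Patch and compute ETags its own way).

```javascript
// In-memory stand-in for a server; handlePatch and the ETag scheme are illustrative.
const resource = { body: { name: "Ann", city: "Oslo" }, etag: '"v1"' };

function handlePatch(headers, changes) {
  // Reject the request if the client's copy is out of date.
  if (headers["If-Match"] !== resource.etag) {
    return { status: 412 }; // Precondition Failed
  }
  Object.assign(resource.body, changes);
  // Issue a new ETag so that later requests carrying the old one fail.
  const version = Number(resource.etag.slice(2, -1)) + 1;
  resource.etag = `"v${version}"`;
  return { status: 200, etag: resource.etag };
}

const first = handlePatch({ "If-Match": '"v1"' }, { city: "Bergen" });
const second = handlePatch({ "If-Match": '"v1"' }, { city: "Tromso" });
console.log(first.status, second.status, resource.body.city); // 200 412 Bergen
```

The second request still carries the old ETag, so it is rejected instead of silently overwriting the first change.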
HTTP Method | Idempotent | Safe
OPTIONS     | YES        | YES
GET         | YES        | YES
HEAD        | YES        | YES
PUT         | YES        | NO
POST        | NO         | NO
DELETE      | YES        | NO
PATCH       | NO         | NO
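The difference between the idempotent and non-idempotent rows can be sketched with a toy in-memory store. The post, put and del functions below are hypothetical handlers written for this illustration, not part of any framework.

```javascript
// Toy in-memory "server"; the handlers mirror the discussion above.
const customers = new Map();
let nextId = 1;

function post(body) {              // server picks the ID: not idempotent
  const id = String(nextId++);
  customers.set(id, body);
  return { status: 201, location: `/customers/${id}` };
}

function put(id, body) {           // client names the resource: idempotent
  const existed = customers.has(id);
  customers.set(id, body);
  return { status: existed ? 200 : 201 };
}

function del(id) {                 // same end state every time, status may differ
  return { status: customers.delete(id) ? 204 : 404 };
}

post({ name: "Ann" });
post({ name: "Ann" });
console.log(customers.size);       // 2 (two identical POSTs, two resources)

put("12345", { name: "Bob" });
put("12345", { name: "Bob" });
console.log(customers.size);       // 3 (repeating the PUT added nothing)

console.log(del("12345").status, del("12345").status); // 204 404
console.log(customers.size);       // 2
```

The repeated DELETE returns a different status code, but the state of the store is the same after the first and the second call.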

Twelve Factor Apps…

Influenced by the Twelve-Factor App principles.

  • Codebase – One codebase tracked in revision control, many deploys. A codebase is any single repo, or any set of repos that share a root commit (in a decentralized revision control system like Git). One codebase maps to many deploys. There is always a one-to-one correlation between the codebase and the app:
      • If there are multiple codebases, it’s not an app – it’s a distributed system. Each component in a distributed system is an app, and each can individually comply with twelve-factor.
      • Multiple apps sharing the same code is a violation of twelve-factor. The solution here is to factor shared code into libraries which can be included through the dependency manager.

    There is only one codebase per app, but there will be many deploys of the app. A deploy is a running instance of the app. This is typically a production site, and one or more staging sites. Additionally, every developer has a copy of the app running in their local development environment, each of which also qualifies as a deploy.

  • Config – Store config in the environment. An app’s config is everything that is likely to vary between deploys (staging, production, developer environments, etc). This includes connection strings, resource handles to the database, Memcached, and other backing services, and credentials to external services such as Amazon S3 or Twitter. The twelve-factor app stores config in environment variables (often shortened to env vars or env). Env vars are easy to change between deploys without changing any code; unlike config files, there is little chance of them being checked into the code repo accidentally; and unlike custom config files, or other config mechanisms, they are a language- and OS-agnostic standard.
  • Backing services – Treat backing services as attached resources. A backing service is any service the app consumes over the network as part of its normal operation. Examples include datastores (such as SQL Server), messaging/queueing systems, SMTP services for outbound email, and caching systems (such as Redis). The code for a twelve-factor app makes no distinction between local and third-party services. To the app, both are attached resources, accessed via a URL or other locator/credentials stored in the config. A deploy of the twelve-factor app should be able to swap out a local SQL Server database with one managed by a third party (such as Azure) without any changes to the app’s code. Likewise, a local SMTP server could be swapped with a third-party SMTP service without code changes. In both cases, only the resource handle in the config needs to change. Resources can be attached to and detached from deploys at will. For example, if the app’s database is misbehaving due to a hardware issue, the app’s administrator might spin up a new database server restored from a recent backup. The current production database could be detached, and the new database attached – all without any code changes.
  • Build, release, run – Strictly separate build and run stages. A codebase is transformed into a (non-development) deploy through three stages:
      • The build stage is a transform which converts a code repo into an executable bundle known as a build. Using a version of the code at a commit specified by the deployment process, the build stage fetches and vendors dependencies and compiles binaries and assets.
      • The release stage takes the build produced by the build stage and combines it with the deploy’s current config. The resulting release contains both the build and the config and is ready for immediate execution in the execution environment.
      • The run stage (also known as “runtime”) runs the app in the execution environment, by launching some set of the app’s processes against a selected release.

    The twelve-factor app uses strict separation between the build, release, and run stages. For example, it is impossible to make changes to the code at runtime, since there is no way to propagate those changes back to the build stage. Builds are initiated by the app’s developers whenever new code is deployed. Runtime execution, by contrast, can happen automatically in cases such as a server reboot, or a crashed process being restarted by the process manager. Therefore, the run stage should be kept to as few moving parts as possible, since problems that prevent an app from running can cause it to break in the middle of the night when no developers are on hand. The build stage can be more complex, since errors are always in the foreground for a developer who is driving the deploy.

  • Processes – Execute the app as one or more stateless processes. The app is executed in the execution environment as one or more processes. Twelve-factor processes are stateless and share-nothing. Any data that needs to persist must be stored in a stateful backing service, typically a database or cache. The memory space or filesystem of the process can be used as a brief, single-transaction cache. For example, downloading a large file, operating on it, and storing the results of the operation in the database. The twelve-factor app never assumes that anything cached in memory or on disk will be available on a future request or job – with many processes of each type running, chances are high that a future request will be served by a different process. Even when running only one process, a restart (triggered by code deploy, config change, or the execution environment relocating the process to a different physical location) will usually wipe out all local (e.g., memory and filesystem) state. Some web systems rely on “sticky sessions” – that is, caching user session data in memory of the app’s process and expecting future requests from the same visitor to be routed to the same process. Sticky sessions are a violation of twelve-factor and should never be used or relied upon. Session state data is a good candidate for a datastore that offers time-expiration, such as Memcached or Redis.
  • Port binding – Export services via port binding. Web apps are sometimes executed inside a webserver container. The twelve-factor app is completely self-contained and does not rely on runtime injection of a webserver into the execution environment to create a web-facing service. The web app exports HTTP as a service by binding to a port, and listening to requests coming in on that port. In a local development environment, the developer visits a service URL like http://localhost:5000/ to access the service exported by their app. In deployment, a routing layer handles routing requests from a public-facing hostname to the port-bound web processes. Note also that the port-binding approach means that one app can become the backing service for another app, by providing the URL to the backing app as a resource handle in the config for the consuming app.
  • Concurrency – Scale out via the process model
    Every process inside your application should be treated as a first-class citizen. That means that each process should be able to scale, restart, or clone itself when needed. This approach will improve the sustainability and scalability of your application as a whole. Using this model, the developer can architect their app to handle diverse workloads by assigning each type of work to a process type. For example, HTTP requests may be handled by a web process, and long-running background tasks handled by a worker process. The process model truly shines when it comes time to scale out. The share-nothing, horizontally partitionable nature of twelve-factor app processes means that adding more concurrency is a simple and reliable operation. The array of process types and number of processes of each type is known as the process formation.
  • Disposability – Maximize robustness with fast startup and graceful shutdown. The twelve-factor app’s processes are disposable, meaning they can be started or stopped at a moment’s notice. This facilitates fast elastic scaling, rapid deployment of code or config changes, and robustness of production deploys. Processes should strive to minimize startup time. Ideally, a process takes a few seconds from the time the launch command is executed until the process is up and ready to receive requests or jobs. Short startup time provides more agility for the release process and scaling up; and it aids robustness, because the process manager can more easily move processes to new physical machines when warranted. Processes shut down gracefully when they receive a signal from the process manager. For a web process, graceful shutdown is achieved by ceasing to listen on the service port (thereby refusing any new requests), allowing any current requests to finish, and then exiting. Implicit in this model is that HTTP requests are short (no more than a few seconds), or in the case of long polling, the client should seamlessly attempt to reconnect when the connection is lost. Processes should also be robust against sudden death, in the case of a failure in the underlying hardware. While this is a much less common occurrence than a graceful shutdown
  • Dev/prod parity – Keep development, staging, and production as similar as possible. Historically, there have been substantial gaps between development (a developer making live edits to a local deploy of the app) and production (a running deploy of the app accessed by end users). These gaps manifest in three areas:
    • The time gap: A developer may work on code that takes days, weeks, or even months to go into production.
    • The personnel gap: Developers write code, ops engineers deploy it.
    • The tools gap: Developers may be using a certain stack while the production deploy uses a different stack.

    The twelve-factor app is designed for continuous deployment by keeping the gap between development and production small. Looking at the three gaps described above:

        • Make the time gap small: a developer may write code and have it deployed hours or even just minutes later.
        • Make the personnel gap small: developers who wrote code are closely involved in deploying it and watching its behavior in production.
        • Make the tools gap small: keep development and production as similar as possible.

    Developers sometimes find great appeal in using a lightweight backing service in their local environments, while a more serious and robust backing service will be used in production. For example local process memory for caching in development and Memcached in production. The twelve-factor developer resists the urge to use different backing services between development and production, even when adapters theoretically abstract away any differences in backing services. Differences between backing services mean that tiny incompatibilities crop up, causing code that worked and passed tests in development or staging to fail in production. These types of errors create friction that disincentivizes continuous deployment. The cost of this friction and the subsequent dampening of continuous deployment is extremely high when considered in aggregate over the lifetime of an application.

  • Logs – Treat logs as event streams. Logs provide visibility into the behavior of a running app. In server-based environments they are commonly written to a file on disk (a “logfile”); but this is only an output format. Logs are the stream of aggregated, time-ordered events collected from the output streams of all running processes and backing services. Logs in their raw form are typically a text format with one event per line (though backtraces from exceptions may span multiple lines). Logs have no fixed beginning or end, but flow continuously as long as the app is operating. A twelve-factor app never concerns itself with routing or storage of its output stream. It should not attempt to write to or manage logfiles. Instead, each running process writes its event stream, unbuffered, to stdout. During local development, the developer will view this stream in the foreground of their terminal to observe the app’s behavior.

    In staging or production deploys, each process’ stream will be captured by the execution environment, collated together with all other streams from the app, and routed to one or more final destinations for viewing and long-term archival. These archival destinations are not visible to or configurable by the app, and instead are completely managed by the execution environment. Open-source log routers (such as Logplex and Fluent) are available for this purpose.

    The event stream for an app can be routed to a file, or watched via realtime tail in a terminal. Most significantly, the stream can be sent to a log indexing and analysis system such as Splunk, or a general-purpose data warehousing system such as Hadoop/Hive. These systems allow for great power and flexibility for introspecting an app’s behavior over time, including:

    • Finding specific events in the past.
    • Large-scale graphing of trends (such as requests per minute).
    • Active alerting according to user-defined heuristics (such as an alert when the quantity of errors per minute exceeds a certain threshold).

  • Admin processes – Run admin/management tasks as one-off processes. The process formation is the array of processes that are used to do the app’s regular business (such as handling web requests) as it runs. Separately, developers will often wish to do one-off administrative or maintenance tasks for the app, such as:
    • Running database migrations
    • Running a console to run arbitrary code or inspect the app’s models against the live database.
    • Running one-time scripts committed into the app’s repo

    One-off admin processes should be run in an identical environment as the regular long-running processes of the app. They run against a release, using the same codebase and config as any process run against that release. Admin code must ship with application code to avoid synchronization issues.
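Several of the factors above (config, port binding, logs, disposability) can be seen together in a small Node.js sketch. It is only an illustration under stated assumptions: PORT and GREETING are example variable names, and loadConfig, logEvent, createServer and start are helper names invented here rather than part of any twelve-factor tooling.

```javascript
const http = require("http");

// Config comes from environment variables; PORT and GREETING are example names.
function loadConfig(env) {
  const port = Number(env.PORT || 5000);
  if (!Number.isInteger(port) || port <= 0) {
    throw new Error(`invalid PORT: ${env.PORT}`);
  }
  return { port, greeting: env.GREETING || "hello" };
}

// Logs go to stdout, one event per line; routing is left to the environment.
function logEvent(message) {
  process.stdout.write(`${new Date().toISOString()} ${message}\n`);
}

// The app exports HTTP itself instead of relying on an injected webserver.
function createServer(config) {
  return http.createServer((req, res) => {
    logEvent(`${req.method} ${req.url}`);
    res.end(config.greeting);
  });
}

function start(env) {
  const config = loadConfig(env);
  const server = createServer(config);
  server.listen(config.port, () => logEvent(`listening on ${config.port}`));

  // Graceful shutdown: stop accepting connections, let in-flight requests finish.
  process.on("SIGTERM", () => {
    logEvent("SIGTERM received, shutting down");
    server.close(() => process.exit(0));
  });
  return server;
}

console.log(loadConfig({ PORT: "8080" })); // { port: 8080, greeting: 'hello' }
```

Calling start(process.env) from the entry point would bind the port; the same code then runs unchanged in development and production, with only the environment differing.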

Some JavaScript nuances – null, undefined, NaN

Undefined

  • JavaScript has a global property named undefined whose value is undefined, and typeof undefined returns the string "undefined"
  • undefined is not a reserved word; it is a property of the global object
  • Undefined is a type with exactly one value: undefined
  • You can get undefined in several ways:
      • A variable declared without assigning any value to it.
      • Implicit returns of functions due to missing return statements.
      • Return statements that do not explicitly return anything.
      • Lookups of non-existent properties in an object.
      • Function parameters that have not been passed.
      • Anything that has been set to the value of undefined.
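The ways of getting undefined listed above can be checked directly; the variable and function names below are arbitrary.

```javascript
let declared;                         // declared, never assigned
function noReturn() {}                // no return statement
function emptyReturn() { return; }    // return with no value
function takesArg(arg) { return arg; }
const obj = { a: 1 };

console.log(declared);                // undefined
console.log(noReturn());              // undefined
console.log(emptyReturn());           // undefined
console.log(obj.missing);             // undefined
console.log(takesArg());              // undefined
console.log(typeof declared);         // undefined (the string "undefined")
```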

null

  • It means an empty or non-existent value; it is also used to indicate intentionally “no value”
  • It is a primitive value, not an object, and you can assign null to any variable
  • typeof null returns "object", but null is not an object and you cannot add properties to it.

null == undefined – This returns true
null === undefined – This returns false
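These comparisons are easy to verify, and the loose equality gives a compact check for "null or undefined". The isNullish helper below is a name chosen for this sketch.

```javascript
console.log(typeof null);             // object
console.log(null == undefined);       // true
console.log(null === undefined);      // false
console.log(null == 0, null == "");   // false false

// True only for null or undefined, because == treats the two as equal
// to each other and to nothing else.
function isNullish(value) {
  return value == null;
}
console.log(isNullish(null), isNullish(undefined), isNullish(0)); // true true false
```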

NaN

  • The NaN property represents a value that is “not a number”. This special value results from an operation that could not be performed either because one of the operands was non-numeric (e.g., 'abc' / 4), or because the result of the operation is not a defined number (e.g., 0 / 0; note that dividing a non-zero number by zero gives Infinity, not NaN). While this seems straightforward enough, there are a couple of somewhat surprising characteristics of NaN that can result in hair-pulling bugs if one is not aware of them.
  • For one thing, although NaN means “not a number”, its type is, believe it or not, Number: console.log(typeof NaN === 'number'); // logs 'true'
  • NaN compared to anything with == or === (even itself!) is false: console.log(NaN === NaN); // logs 'false'. Use Number.isNaN(value) to test for NaN reliably.
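The NaN behaviour described above, together with the usual ways of testing for it, can be confirmed with a few one-liners.

```javascript
console.log("abc" / 4);               // NaN
console.log(0 / 0);                   // NaN
console.log(1 / 0);                   // Infinity
console.log(typeof NaN);              // number
console.log(NaN === NaN);             // false
console.log(NaN !== NaN);             // true
console.log(Number.isNaN(NaN));       // true
console.log(Number.isNaN("abc"));     // false (no coercion, "abc" is a string)
console.log(isNaN("abc"));            // true  (global isNaN coerces to a number first)
console.log(Object.is(NaN, NaN));     // true
```

Number.isNaN is usually the safer check, since the global isNaN reports true for any value that merely fails to convert to a number.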

C# and MongoDB Driver – Building Dynamic Queries based on Free Text Input

If you are using MongoDB with .NET through the C# MongoDB driver, you would typically query the MongoDB data store with LINQ queries built from static lambda expressions, e.g.

            IQueryable<User> userQuery = db.Query;
            // Suppose we have to query "John" or "Mary" or "Thaine".
            // The easiest option is to build a static expression.
            // The search terms are lower-case because FirstName is lower-cased before matching.
            userQuery = userQuery.Where(u => (u.FirstName.ToLower().Contains("john") ||
                                u.FirstName.ToLower().Contains("mary") ||
                                u.FirstName.ToLower().Contains("thaine")));

Here the User entity looks something like this:

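    using MongoDB.Bson;                           // ObjectId
    using MongoDB.Bson.Serialization.Attributes;  // BsonIgnoreExtraElements

    [BsonIgnoreExtraElements]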
    public class User
    {
        public ObjectId Id { get; set; }
        public string FirstName { get; set; }
        public string LastName { get; set; }
        public string Email { get; set; }
        public string Employer { get; set; }
        public Address Address { get; set; }
    }

    [BsonIgnoreExtraElements]
    public class Address
    {
        public string StreetName1 { get; set; }
        public string StreetName2 { get; set; }
        public string City { get; set; }
        public string State { get; set; }
        public string Zipcode { get; set; }
    }

Building static expressions is nothing new; we use them more often than not. But what if your application is required to search on free text entered by the user? For example, the application UI may have a text box for free-text search with and / or / not operators, like “John or Mary or Thaine”. With a static expression we would do something like the above.

But this is not what we want. We want the end user to send a query like the one above through the UI, with any combination of and and or. The static expression above does not fit that requirement, and this is where building a dynamic expression tree comes in handy. A sample query sent through the UI could be “Mary and Thaine”, “Mary or John or Thaine”, or “Mary and New York”. So instead of static expressions we build a dynamic expression tree as shown below.

            IQueryable<User> userQuery = db.Query;
            string searchTerm = "John or Mary or Thaine";
            userQuery = userQuery.Where(
                CSharpMongoDynamicQueries.BuildAdvanceSearchExpressionTree<User>(searchTerm));

 

Here CSharpMongoDynamicQueries.BuildAdvanceSearchExpressionTree returns a dynamic expression based on the search term.

CSharpMongoDynamicQueries looks something like this.

    using System;
    using System.Collections.Generic;
    using System.Linq.Expressions;

    public class CSharpMongoDynamicQueries
    {
        #region Private static members
        private static readonly string AndOperator = " and ";
        private static readonly string OrOperator = " or ";
        private static readonly string NotOperator = " not ";
        private static readonly string[] stringSeparators = new string[] { AndOperator, OrOperator };
        #endregion


        /// <summary>
        /// Builds the advanced search expression tree.
        /// This enables dynamic queries on the following attributes:
        /// FirstName, LastName, Employer, Email, Address.City.
        /// It can be enabled on other fields as well; it is restricted to this set of fields as a sample.
        /// </summary>
        /// <typeparam name="TSource">The type of the source.</typeparam>
        /// <param name="searchTerm">The search term.</param>
        /// <returns>A predicate expression built from the search term.</returns>
        public static Expression<Func<TSource, bool>> BuildAdvanceSearchExpressionTree<TSource>(string searchTerm)
        {
            ParameterExpression pe = Expression.Parameter(typeof(TSource), "User");
            string[] result = searchTerm.Split(stringSeparators, StringSplitOptions.RemoveEmptyEntries);
            var operatorIndexes = GetListOfSortedOperatorIndexes(searchTerm);
            Expression searchExpression = null;


            #region Employer
            var employerExpression = GetExpression<TSource>(result, searchTerm, "Employer", operatorIndexes, pe, false);
            searchExpression = searchExpression == null ? employerExpression : Expression.OrElse(searchExpression, employerExpression);
            #endregion

            #region Firstname
            var firstNameExpression = GetExpression<TSource>(result, searchTerm, "FirstName", operatorIndexes, pe, false);
            searchExpression = searchExpression == null ? firstNameExpression : Expression.OrElse(searchExpression, firstNameExpression);
            #endregion

            #region LastName
            var lastNameExpression = GetExpression<TSource>(result, searchTerm, "LastName", operatorIndexes, pe, false);
            searchExpression = searchExpression == null ? lastNameExpression : Expression.OrElse(searchExpression, lastNameExpression);
            #endregion

            #region Email
            var emailExpression = GetExpression<TSource>(result, searchTerm, "Email", operatorIndexes, pe, false);
            searchExpression = searchExpression == null ? emailExpression : Expression.OrElse(searchExpression, emailExpression);
            #endregion

            #region City
            var cityExpression = GetExpression<TSource>(result, searchTerm, "Address.City", operatorIndexes, pe, true);
            searchExpression = searchExpression == null ? cityExpression : Expression.OrElse(searchExpression, cityExpression);
            #endregion

            return Expression.Lambda<Func<TSource, bool>>(searchExpression, pe);
        }

        /// <summary>
        /// Gets the list of sorted operator indexes.
        /// </summary>
        /// <param name="input">The input.</param>
        /// <returns></returns>
        private static List<OperatorIndexes> GetListOfSortedOperatorIndexes(string input)
        {
            input = input.ToLower();

            List<OperatorIndexes> operatorIndexes = new List<OperatorIndexes>();
            operatorIndexes.AddRange(AllIndexesOf(input, AndOperator));
            operatorIndexes.AddRange(AllIndexesOf(input, OrOperator));

            // Return the sorted list: the operators must be in the order they appear in the
            // input so that operatorIndexes[i] is the operator between term i and term i + 1.
            return operatorIndexes.OrderBy(v => v.Index).ToList();
        }

        /// <summary>
        /// Finds every index at which <paramref name="value"/> occurs in <paramref name="str"/>.
        /// </summary>
        /// <param name="str">The string.</param>
        /// <param name="value">The value.</param>
        /// <returns></returns>
        /// <exception cref="System.ArgumentException">the string to find may not be empty;value</exception>
        private static List<OperatorIndexes> AllIndexesOf(string str, string value)
        {
            if (String.IsNullOrEmpty(value))
                throw new ArgumentException("the string to find may not be empty", "value");
            List<OperatorIndexes> indexes = new List<OperatorIndexes>();
            for (int index = 0; ; index += value.Length)
            {
                index = str.IndexOf(value, index);
                if (index == -1)
                    return indexes;
                indexes.Add(new OperatorIndexes
                {
                    Index = index,
                    Operator = value
                });
            }
        }


        /// <summary>
        /// Gets the expression.
        /// </summary>
        /// <typeparam name="TSource">The type of the source.</typeparam>
        /// <param name="searchTerms">The search terms.</param>
        /// <param name="keyword">The keyword.</param>
        /// <param name="propertyName">Name of the property.</param>
        /// <param name="operatorIndexes">The operator indexes.</param>
        /// <param name="pe">The pe.</param>
        /// <param name="addParentObjectNullCheck">if set to <c>true</c> [add parent object null check].</param>
        /// <returns></returns>
        private static Expression GetExpression<TSource>(string[] searchTerms, string keyword, string propertyName,
            List<OperatorIndexes> operatorIndexes, ParameterExpression pe, bool addParentObjectNullCheck)
        {
            // Compose the expression tree that represents the parameter to the predicate.
            Expression propertyExp = pe;
            foreach (var member in propertyName.Split('.'))
            {
                propertyExp = Expression.PropertyOrField(propertyExp, member);
            }
            Expression searchExpression = null;
            Expression finalExpression = null;
            Expression nullorEmptyCheck = null;
            var left = Expression.Call(propertyExp, typeof(string).GetMethod("ToLower", Type.EmptyTypes));
            var method = typeof(string).GetMethod("Contains", new[] { typeof(string) });
            for (int count = 0; count < searchTerms.Length; count++)
            {
                // Terms are compared in lower case; wildcard and quote characters are stripped.
                var searchTerm = searchTerms[count].ToLower();
                searchTerm = searchTerm.Replace("*", string.Empty);
                searchTerm = searchTerm.Replace("\"", string.Empty).Trim();
                Expression methodCallExpresssion = null;
                Expression rightExpression = null;
                // A leading "not " negates the term. Checking only the prefix avoids
                // misreading terms that merely contain "not " (for example "knot tying").
                var notPrefix = NotOperator.TrimStart();
                bool negate = searchTerm.StartsWith(notPrefix, StringComparison.Ordinal);
                if (negate)
                {
                    searchTerm = searchTerm.Substring(notPrefix.Length).Trim();
                }
                rightExpression = Expression.Constant(searchTerm);
                methodCallExpresssion = Expression.Call(left, method, rightExpression);
                if (negate)
                {
                    methodCallExpresssion = Expression.Not(methodCallExpresssion);
                }

                if (count == 0)
                {
                    searchExpression = methodCallExpresssion;
                }
                else
                {
                    var conditionOperator = operatorIndexes[count - 1].Operator.Trim();
                    switch (conditionOperator)
                    {
                        case "and":
                            searchExpression = Expression.AndAlso(searchExpression, methodCallExpresssion);
                            break;
                        case "or":
                            searchExpression = Expression.OrElse(searchExpression, methodCallExpresssion);
                            break;
                        default:
                            break;
                    }
                }
            }

            if (addParentObjectNullCheck)
            {
                //Add Null check for Address Object before checking the value of City.
                var nullCheck = Expression.NotEqual(Expression.PropertyOrField(pe, "Address"), Expression.Constant(null, typeof(object)));
                nullorEmptyCheck = Expression.Not(Expression.Call(typeof(string), (typeof(string).GetMethod("IsNullOrEmpty")).Name, null, propertyExp));
                finalExpression = Expression.AndAlso(nullCheck, nullorEmptyCheck);
                finalExpression = Expression.AndAlso(finalExpression, searchExpression);
            }
            else
            {
                nullorEmptyCheck = Expression.Not(Expression.Call(typeof(string), (typeof(string).GetMethod("IsNullOrEmpty")).Name, null, propertyExp));
                finalExpression = Expression.AndAlso(nullorEmptyCheck, searchExpression);
            }

            return finalExpression;
        }
    }

    internal class OperatorIndexes
    {
        public int Index { get; set; }
        public string Operator { get; set; }
    }

 

This sample does an “or” search across all the searchable fields, but you can change it to target specific fields so that your query becomes clearer and crisper.
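As a rough illustration of targeting specific fields, the sketch below takes the field names as a parameter and ORs together one contains check per field. It is not part of the original sample: the Person class, the FieldTargetedSearch name and the Build signature are all assumptions made for this example, and it handles a single term only.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical entity used only for this sketch.
public class Person
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public string Email { get; set; }
}

public static class FieldTargetedSearch
{
    // Builds x => (f1 != null && f1.ToLower().Contains(term)) || (f2 != null && ...) ...
    // for the given string properties only.
    public static Expression<Func<T, bool>> Build<T>(string term, params string[] fields)
    {
        if (fields == null || fields.Length == 0)
            throw new ArgumentException("At least one field is required.", nameof(fields));

        ParameterExpression pe = Expression.Parameter(typeof(T), "x");
        var toLower = typeof(string).GetMethod("ToLower", Type.EmptyTypes);
        var contains = typeof(string).GetMethod("Contains", new[] { typeof(string) });
        Expression needle = Expression.Constant(term.ToLower());

        Expression body = fields
            .Select(field =>
            {
                Expression prop = Expression.Property(pe, field);
                Expression notNull = Expression.NotEqual(prop, Expression.Constant(null, typeof(string)));
                Expression match = Expression.Call(Expression.Call(prop, toLower), contains, needle);
                return (Expression)Expression.AndAlso(notNull, match);
            })
            // Combine the per-field checks with OrElse, left to right.
            .Aggregate((acc, next) => Expression.OrElse(acc, next));

        return Expression.Lambda<Func<T, bool>>(body, pe);
    }
}
```

Calling FieldTargetedSearch.Build<Person>("mary", "FirstName") would then search first names only, while passing more field names widens the search again.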

The GetExpression method is the heart of this sample: it builds the dynamic expression tree for a particular field. The expression tree performs a contains search, but you can change it to an exact-match search as well. This method also adds a null-or-empty check on the searchable field and, for nested properties such as Address.City, a null check on the parent object.
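To make the contains versus exact-match choice concrete, here is a minimal sketch for a single field and a single term. The MatchMode enum and the MatchModeSketch class are hypothetical names introduced for this example; whether a given LINQ provider translates the string equality node is something you would need to verify against your driver.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical switch between the two search styles.
public enum MatchMode { Contains, Exact }

public static class MatchModeSketch
{
    public static Expression<Func<T, bool>> Build<T>(string propertyName, string term, MatchMode mode)
    {
        ParameterExpression pe = Expression.Parameter(typeof(T), "x");
        Expression prop = Expression.PropertyOrField(pe, propertyName);
        Expression lowered = Expression.Call(prop, typeof(string).GetMethod("ToLower", Type.EmptyTypes));
        Expression needle = Expression.Constant(term.ToLower());

        // Exact: x.Prop.ToLower() == term; Contains: x.Prop.ToLower().Contains(term)
        Expression match = mode == MatchMode.Exact
            ? (Expression)Expression.Equal(lowered, needle)
            : Expression.Call(lowered, typeof(string).GetMethod("Contains", new[] { typeof(string) }), needle);

        // Same guard as the sample: skip values that are null or empty.
        Expression notEmpty = Expression.Not(
            Expression.Call(typeof(string).GetMethod("IsNullOrEmpty", new[] { typeof(string) }), prop));

        return Expression.Lambda<Func<T, bool>>(Expression.AndAlso(notEmpty, match), pe);
    }
}

// Hypothetical entity used only for this sketch.
public class Employee
{
    public string Name { get; set; }
}
```

With MatchMode.Exact the term "mary" matches "Mary" but not "Maryann"; with MatchMode.Contains it matches both.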

GetExpression expects the searchTerms array as a parameter. This array is built by splitting the search term on the and/or operators. The matching operators are located by the private methods GetListOfSortedOperatorIndexes and AllIndexesOf, so that each operator can be paired with the term that follows it. These methods are fairly self-explanatory and could be optimized further.
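One possible way to simplify the splitting step is to collect terms and operators in a single pass with a regular expression. This is a sketch of an alternative, not the approach used in the sample; SearchTermTokenizer and Tokenize are names invented here. It relies on the documented behaviour of Regex.Split that text matched by a capturing group is included in the result array.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class SearchTermTokenizer
{
    // "and" / "or" surrounded by whitespace, matched case-insensitively.
    // The capturing group makes Regex.Split return the operators as well as the terms.
    private static readonly Regex OperatorPattern =
        new Regex(@"\s+(and|or)\s+", RegexOptions.IgnoreCase);

    public static (List<string> Terms, List<string> Operators) Tokenize(string searchTerm)
    {
        var terms = new List<string>();
        var operators = new List<string>();

        // The result alternates: term, operator, term, operator, ..., term.
        string[] parts = OperatorPattern.Split(searchTerm.Trim());
        for (int i = 0; i < parts.Length; i++)
        {
            if (i % 2 == 0)
                terms.Add(parts[i].Trim());
            else
                operators.Add(parts[i].ToLowerInvariant());
        }
        return (terms, operators);
    }
}
```

Because the operators come back in input order, Operators[i] is always the operator between Terms[i] and Terms[i + 1], so no separate index sorting is needed.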

The sample is tightly bound to the User entity, but the same approach can be applied to any object stored in Mongo or any other NoSQL database, as long as the data access goes through IQueryable and expression trees.
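As a sketch of how the idea carries over to other entities, the helper below accepts any dotted property path and adds a null check for every reference-typed step, instead of hard-coding Address. The Order and Customer classes and the NestedPathPredicate name are assumptions for illustration, and the example runs against an in-memory list through AsQueryable rather than a real database.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical entities used only for this sketch.
public class Customer
{
    public string Name { get; set; }
}

public class Order
{
    public Customer Customer { get; set; }
}

public static class NestedPathPredicate
{
    // For "Customer.Name" builds:
    // x => x.Customer != null && x.Customer.Name != null && x.Customer.Name.ToLower().Contains(term)
    public static Expression<Func<T, bool>> Contains<T>(string propertyPath, string term)
    {
        ParameterExpression pe = Expression.Parameter(typeof(T), "x");
        Expression current = pe;
        Expression guard = null;

        foreach (string member in propertyPath.Split('.'))
        {
            current = Expression.PropertyOrField(current, member);
            if (!current.Type.IsValueType)
            {
                // Guard each reference-typed step (parents and the final string) against null.
                Expression notNull = Expression.NotEqual(current, Expression.Constant(null, current.Type));
                guard = guard == null ? notNull : Expression.AndAlso(guard, notNull);
            }
        }

        if (current.Type != typeof(string))
            throw new ArgumentException("The property path must end in a string property.", nameof(propertyPath));

        var toLower = typeof(string).GetMethod("ToLower", Type.EmptyTypes);
        var contains = typeof(string).GetMethod("Contains", new[] { typeof(string) });
        Expression match = Expression.Call(
            Expression.Call(current, toLower), contains, Expression.Constant(term.ToLower()));

        return Expression.Lambda<Func<T, bool>>(Expression.AndAlso(guard, match), pe);
    }
}
```

Running the predicate against a List<Order> via AsQueryable() is also a convenient way to unit test the generated tree without a database connection.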

In the above sample, for the query “John or Mary or Thaine”, CSharpMongoDynamicQueries returned an expression tree something like the one shown below:

User => (((((Not(IsNullOrEmpty(User.Employer)) 
AndAlso ((User.Employer.ToLower().Contains("john") 
OrElse User.Employer.ToLower().Contains("mary"))
OrElse User.Employer.ToLower().Contains("thaine"))) 
OrElse (Not(IsNullOrEmpty(User.FirstName)) 
AndAlso ((User.FirstName.ToLower().Contains("john") 
OrElse User.FirstName.ToLower().Contains("mary")) 
OrElse User.FirstName.ToLower().Contains("thaine")))) 
OrElse (Not(IsNullOrEmpty(User.LastName)) 
AndAlso ((User.LastName.ToLower().Contains("john") 
OrElse User.LastName.ToLower().Contains("mary")) 
OrElse User.LastName.ToLower().Contains("thaine")))) 
OrElse (Not(IsNullOrEmpty(User.Email)) 
AndAlso ((User.Email.ToLower().Contains("john") 
OrElse User.Email.ToLower().Contains("mary")) 
OrElse User.Email.ToLower().Contains("thaine")))) 
OrElse (((User.Address != null) 
AndAlso Not(IsNullOrEmpty(User.Address.City))) 
AndAlso ((User.Address.City.ToLower().Contains("john") 
OrElse User.Address.City.ToLower().Contains("mary")) 
OrElse User.Address.City.ToLower().Contains("thaine"))))
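Note how the terms in the tree above nest from the left. GetExpression combines terms strictly in the order they appear, so a mixed query is grouped left to right and "and" does not bind tighter than "or". The sketch below isolates that folding step; LeftToRightFold is a hypothetical helper written for this illustration, not code from the sample.

```csharp
using System;
using System.Linq.Expressions;

public static class LeftToRightFold
{
    // Combines operands in input order: operators[i - 1] joins the running
    // result with operands[i], mirroring the loop in GetExpression.
    public static Expression Fold(Expression[] operands, string[] operators)
    {
        if (operands.Length == 0 || operators.Length != operands.Length - 1)
            throw new ArgumentException("Expected exactly one operator between each pair of operands.");

        Expression result = operands[0];
        for (int i = 1; i < operands.Length; i++)
        {
            result = operators[i - 1] == "and"
                ? Expression.AndAlso(result, operands[i])
                : Expression.OrElse(result, operands[i]);
        }
        return result;
    }
}
```

Folding the operands a, b, c with the operators "or", "and" yields a tree grouped as (a or b) and c, whereas conventional precedence would read "a or b and c" as a or (b and c). If your users expect the conventional reading, the search term would need a proper parser.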

 

Source code for the sample can be found over here : Source Code

If you are planning to run the sample, please make sure to change the Mongo connection string in app.config. Replace <dbusername>, <dbpassword> and <dbname> as appropriate for your dev environment.

<add key="MongoConnectionString" value="mongodb://<dbusername>:<dbpassword>@ds030719.mlab.com:30719/<dbname>"/>

 

Please tweet me at @mytechnetnohows if anything here is unclear.