Category Archives: Mongo DB

Mongo Shell – Display More than 20 records

How to print out more than 20 items (documents) in MongoDB’s shell?

Simple use this command on Mongo Shell

DBQuery.shellBatchSize = 300

 

 

Advertisements

CAP Theorem – Consistency Availability and partition Tolerance

  • Consistency – It means that data is the same across the cluster, so you can read or write to/from any node and get the same data.
  • Availability – It means the ability to access the cluster even if a node in the cluster goes down.
  • Partition Tolerance – It means that the cluster continues to function even if there is a “partition” (communications break) between two nodes (both nodes are up, but can’t communicate).

In order to get both availability and partition tolerance, you have to give up consistency. Consider if you have two nodes, X and Y, in a master-master setup. Now, there is a break between network communication in X and Y, so they can’t sync updates. At this point you can either:

A) Allow the nodes to get out of sync (giving up consistency), or

B) Consider the cluster to be “down” (giving up availability)

All the combinations available are:

  • CA – data is consistent between all nodes – as long as all nodes are online – and you can read/write from any node and be sure that the data is the same, but if you ever develop a partition between nodes, the data will be out of sync (and won’t re-sync once the partition is resolved). In practice however, there are of course CA systems, e.g. single-node RDBMS. Or even master-slave/master-master replicated RDBMS, provided there’s a central router knowing which nodes live and directing client appropriately. Such a router is then a single poin
  • CP – data is consistent between all nodes, and maintains partition tolerance (preventing data desync) by becoming unavailable when a node goes down.
  • AP – nodes remain online even if they can’t communicate with each other and will re-sync data once the partition is resolved, but you aren’t guaranteed that all nodes will have the same data (either during or after the partition).

CAP triangle can be visualized like shown below

CAPTriangle

CAP – CA, CP and AP in summary can be visualized like shown below, all below doagrams thanks to Slide share presentations

CAP-AP-CP-CA-Combinations

Choosing AP

CAP-AP

Choosing CP

CAP-CP

 

Every one is aware of ACID

  • Atomic: Everything in a transaction succeeds or the entire transaction is rolled back.
  • Consistent: A transaction cannot leave the database in an inconsistent state.
  • Isolated: Transactions cannot interfere with each other.
  • Durable: Completed transactions persist, even when servers restart etc.

ACID transactions are far more pessimistic (i.e., they’re more worried about data safety) than the domain actually requires.

In the NoSQL world, ACID transactions are less fashionable as some databases have loosened the requirements for immediate consistency, data freshness and accuracy in order to gain other benefits, like scale and resilience. THta’s where BASE acronym comes into picture

Basic Availability

    • The database appears to work most of the time.

Soft-state

    • Stores don’t have to be write-consistent, nor do different replicas have to be mutually consistent all the time.

Eventual consistency

    • Stores exhibit consistency at some later point (e.g., lazily at read time).

 

 

C# and MongDB Driver – Building Dynamic Queries based on Free Text Input

In case if you are using Mongo DB with .NET using C# MongoDB drivers and you would be typically querying Mongo DB data store using LINQ queries using static Lambda expressions. for e.g.

           IQueryable<User> userQuery = db.Query;
            ////Suppose I have to query "John" or "Mary" or "Thaine".
            ////Easiest option is to build static expression
            userQuery = userQuery.Where(u => (u.FirstName.ToLower().Contains("John") ||
                                u.FirstName.ToLower().Contains("Mary") ||
                                u.FirstName.ToLower().Contains("Thaine")));

Where user entity looks some thing like this

    [BsonIgnoreExtraElements]
    public class User
    {
        public ObjectId Id { get; set; }
        public string FirstName { get; set; }
        public string LastName { get; set; }
        public string Email { get; set; }
        public string Employer { get; set; }
        public Address Address { get; set; }
    }

    [BsonIgnoreExtraElements]
    public class Address
    {
        public string StreetName1 { get; set; }
        public string StreetName2 { get; set; }
        public string City { get; set; }
        public string State { get; set; }
        public string Zipcode { get; set; }
    }

Building static expressions is nothing new which we use more often then not. But what in in your application if there is requirement to perform search based of free text entered by user for e.g. in application UI there is text box to enter free text search with and / or/ not operator like “John or Mary or Thaine”. Then in case of static expression we do something like above.

But this is not what we want, we want the end user through UI send the query like above one and end user could send any combinations of and and or. Above solution of static expression does not fit the requirement. So here is where building dynamic expression tree comes in handy. Sample query that end user could be doing through UI could be “Mary and Thaine” or it could be “Mary or John or Thaine” or “Mary and New York”. So instead of static expressions we build a dynamic expression tree like show below

            IQueryable<User> userQuery = db.Query;
            string searchTerm = "John or Mary or Thaine";
            userQuery = userQuery.Where(
                CSharpMongoDynamicQueries.BuildAdvanceSearchExpressionTree<User>(searchTerm));

 

Where CSharpMongoDynamicQueries.BuildAdvanceSearchExpressionTree returns dynamic expression based on search term.

CSharpMongoDynamicQueries looks some thing like this.

    public class CSharpMongoDynamicQueries
    {
        #region Private static members
        private static readonly string AndOperator = " and ";
        private static readonly string OrOperator = " or ";
        private static readonly string NotOperator = " not ";
        private static readonly string[] stringSeparators = new string[] { AndOperator, OrOperator };
        #endregion


        /// <summary>
        /// Builds the advance search expression tree.
        /// This enables dynamic query on Following attributes
        /// FirstName, LastName, Employer, Email, Address.City.
        /// It can be enabled on other fields as well, but i have just restricted as a sample ot these set of fields.
        /// </summary>
        /// <typeparam name="TSource">The type of the source.</typeparam>
        /// <param name="searchTerm">The search term.</param>
        /// <param name="search">The search.</param>
        /// <returns></returns>
        public static Expression<Func<TSource, bool>> BuildAdvanceSearchExpressionTree<TSource>(string searchTerm)
        {
            ParameterExpression pe = Expression.Parameter(typeof(TSource), "User");
            string[] result = searchTerm.Split(stringSeparators, StringSplitOptions.RemoveEmptyEntries);
            var operatorIndexes = GetListOfSortedOperatorIndexes(searchTerm);
            Expression searchExpression = null;


            #region Employer
            var employerExpression = GetExpression<TSource>(result, searchTerm, "Employer", operatorIndexes, pe, false);
            searchExpression = searchExpression == null ? employerExpression : Expression.OrElse(searchExpression, employerExpression);
            #endregion

            #region Firstname
            var firstNameExpression = GetExpression<TSource>(result, searchTerm, "FirstName", operatorIndexes, pe, false);
            searchExpression = searchExpression == null ? firstNameExpression : Expression.OrElse(searchExpression, firstNameExpression);
            #endregion

            #region LastName
            var lastNameExpression = GetExpression<TSource>(result, searchTerm, "LastName", operatorIndexes, pe, false);
            searchExpression = searchExpression == null ? lastNameExpression : Expression.OrElse(searchExpression, lastNameExpression);
            #endregion

            #region Email
            var emailExpression = GetExpression<TSource>(result, searchTerm, "Email", operatorIndexes, pe, false);
            searchExpression = searchExpression == null ? emailExpression : Expression.OrElse(searchExpression, emailExpression);
            #endregion

            #region City
            var cityExpression = GetExpression<TSource>(result, searchTerm, "Address.City", operatorIndexes, pe, true);
            searchExpression = searchExpression == null ? cityExpression : Expression.OrElse(searchExpression, cityExpression);
            #endregion

            return Expression.Lambda<Func<TSource, bool>>(searchExpression, pe);
        }

        /// <summary>
        /// Gets the list of sorted operator indexes.
        /// </summary>
        /// <param name="input">The input.</param>
        /// <returns></returns>
        private static List<OperatorIndexes> GetListOfSortedOperatorIndexes(string input)
        {

            input = input.ToLower();
            string[] result = input.Split(stringSeparators, StringSplitOptions.RemoveEmptyEntries);

            List<OperatorIndexes> operatorIndexes = new List<OperatorIndexes>();
            var andOperatorIndexes = AllIndexesOf(input, stringSeparators[0]);
            var orOperatorIndexes = AllIndexesOf(input, stringSeparators[1]);

            if (andOperatorIndexes.Count > 0)
            {
                operatorIndexes.AddRange(andOperatorIndexes);
            }

            if (orOperatorIndexes.Count > 0)
            {
                operatorIndexes.AddRange(orOperatorIndexes);
            }

            if (operatorIndexes.Count > 0)
            {
                var sorterOperatorIndexes = operatorIndexes.ToList<OperatorIndexes>().OrderBy(v => v.Index).ToList<OperatorIndexes>();
            }
            return operatorIndexes;
        }

        /// <summary>
        /// Alls the indexes of.
        /// </summary>
        /// <param name="str">The string.</param>
        /// <param name="value">The value.</param>
        /// <returns></returns>
        /// <exception cref="System.ArgumentException">the string to find may not be empty;value</exception>
        private static List<OperatorIndexes> AllIndexesOf(string str, string value)
        {
            if (String.IsNullOrEmpty(value))
                throw new ArgumentException("the string to find may not be empty", "value");
            List<OperatorIndexes> indexes = new List<OperatorIndexes>();
            for (int index = 0; ; index += value.Length)
            {
                index = str.IndexOf(value, index);
                if (index == -1)
                    return indexes;
                indexes.Add(new OperatorIndexes
                {
                    Index = index,
                    Operator = value
                });
            }
        }


        /// <summary>
        /// Gets the expression.
        /// </summary>
        /// <typeparam name="TSource">The type of the source.</typeparam>
        /// <param name="searchTerms">The search terms.</param>
        /// <param name="keyword">The keyword.</param>
        /// <param name="propertyName">Name of the property.</param>
        /// <param name="operatorIndexes">The operator indexes.</param>
        /// <param name="pe">The pe.</param>
        /// <param name="addParentObjectNullCheck">if set to <c>true</c> [add parent object null check].</param>
        /// <returns></returns>
        private static Expression GetExpression<TSource>(string[] searchTerms, string keyword, string propertyName,
            List<OperatorIndexes> operatorIndexes, ParameterExpression pe, bool addParentObjectNullCheck)
        {
            // Compose the expression tree that represents the parameter to the predicate.
            Expression propertyExp = pe;
            foreach (var member in propertyName.Split('.'))
            {
                propertyExp = Expression.PropertyOrField(propertyExp, member);
            }
            Expression searchExpression = null;
            Expression finalExpression = null;
            Expression nullorEmptyCheck = null;
            var left = Expression.Call(propertyExp, typeof(string).GetMethod("ToLower", Type.EmptyTypes));
            var method = typeof(string).GetMethod("Contains", new[] { typeof(string) });
            for (int count = 0; count < searchTerms.Length; count++)
            {
                var searchTerm = searchTerms[count].ToLower();
                searchTerm = searchTerm.Replace("*", string.Empty);
                searchTerm = searchTerm.Replace("\"", string.Empty);
                Expression methodCallExpresssion = null;
                Expression rightExpression = null;
                if (searchTerm.Contains(NotOperator.TrimStart()))
                {
                    searchTerm = searchTerm.Replace(NotOperator.TrimStart(), string.Empty).Trim();
                    rightExpression = Expression.Constant(searchTerm);
                    methodCallExpresssion = Expression.Call(left, method, rightExpression);
                    methodCallExpresssion = Expression.Not(methodCallExpresssion);
                }
                else
                {
                    rightExpression = Expression.Constant(searchTerm);
                    methodCallExpresssion = Expression.Call(left, method, rightExpression);
                }

                if (count == 0)
                {
                    searchExpression = methodCallExpresssion;
                }
                else
                {
                    var conditionOperator = operatorIndexes[count - 1].Operator.Trim();
                    switch (conditionOperator)
                    {
                        case "and":
                            searchExpression = Expression.AndAlso(searchExpression, methodCallExpresssion);
                            break;
                        case "or":
                            searchExpression = Expression.OrElse(searchExpression, methodCallExpresssion);
                            break;
                        default:
                            break;
                    }
                }
            }

            if (addParentObjectNullCheck)
            {
                //Add Null check for Address Object before checking the value of City.
                var nullCheck = Expression.NotEqual(Expression.PropertyOrField(pe, "Address"), Expression.Constant(null, typeof(object)));
                nullorEmptyCheck = Expression.Not(Expression.Call(typeof(string), (typeof(string).GetMethod("IsNullOrEmpty")).Name, null, propertyExp));
                finalExpression = Expression.AndAlso(nullCheck, nullorEmptyCheck);
                finalExpression = Expression.AndAlso(finalExpression, searchExpression);
            }
            else
            {
                nullorEmptyCheck = Expression.Not(Expression.Call(typeof(string), (typeof(string).GetMethod("IsNullOrEmpty")).Name, null, propertyExp));
                finalExpression = Expression.AndAlso(nullorEmptyCheck, searchExpression);
            }

            return finalExpression;
        }
    }

    internal class OperatorIndexes
    {
        public int Index { get; set; }
        public string Operator { get; set; }
    }

 

This sample does “or” search on all the searchable fields but you can change it to target to certain fields so that your query becomes more clear and crisp.

GetExpression method is the heart of this it builds the dynamic expression tree for a particular field. Also expression tree is doing contains search you can change it to have exact search as well. This method add null check on the searchable fields and on the parent object as well (Address.City).

GetExpression expects to have searchTerms array as parameter, this array is build based on splitting SearchTerm based on and/or operator. This array is been built is appropriate sequence by private methods GetListOfSortedOperatorIndexes and AllIndexesOf, which are quite self explanatory, these methods could be more optimized as well.

Sample is tightly bound to User entity but this can be applied to any object which gets stored in mongo or any other NON SQL Db as long as it uses the concept of IQueryable and expression trees.

In above sample the for the sample query “John or Mary or Thaine”  CSharpMongoDynamicQueries returned expression tree something like shown below

User => (((((Not(IsNullOrEmpty(User.Employer)) 
AndAlso ((User.Employer.ToLower().Contains("john") 
OrElse User.Employer.ToLower().Contains("mary"))
OrElse User.Employer.ToLower().Contains("thaine"))) 
OrElse (Not(IsNullOrEmpty(User.FirstName)) 
AndAlso ((User.FirstName.ToLower().Contains("john") 
OrElse User.FirstName.ToLower().Contains("mary")) 
OrElse User.FirstName.ToLower().Contains("thaine")))) 
OrElse (Not(IsNullOrEmpty(User.LastName)) 
AndAlso ((User.LastName.ToLower().Contains("john") 
OrElse User.LastName.ToLower().Contains("mary")) 
OrElse User.LastName.ToLower().Contains("thaine")))) 
OrElse (Not(IsNullOrEmpty(User.Email)) 
AndAlso ((User.Email.ToLower().Contains("john") 
OrElse User.Email.ToLower().Contains("mary")) 
OrElse User.Email.ToLower().Contains("thaine")))) 
OrElse (((User.Address != null) 
AndAlso Not(IsNullOrEmpty(User.Address.City))) 
AndAlso ((User.Address.City.ToLower().Contains("john") 
OrElse User.Address.City.ToLower().Contains("mary")) 
OrElse User.Address.City.ToLower().Contains("thaine"))))

 

Source code for the sample can be found over here : Source Code

if you planning to run the sample please ensure to change the Mongo connection string in app.config. Change <dbusername>, <dbpassword> and <dbname> as appropriate as per your dev environment.

<add key="MongoConnectionString" value="mongodb://<dbusername>:<dbpassword>@ds030719.mlab.com:30719/<dbname>"/>

 

Please tweet me at @mytechnetnohows in case if I am not able to clarify things over here.