DR. SHAFIQ UR REHMAN

Assistant Professor
  • Department of Computer Science
  • shafiq.rehman@namal.edu.pk
Summary

Dr. Khan holds a PhD from Capital University of Science and Technology (CUST), Islamabad. Since completing his doctorate, he has served in various academic roles, including Assistant Professor, Head of Department, Provost, and Director of the Business Incubation Center, and as a member of several key committees. With over 8 years of combined industry and academic experience, his research interests encompass Natural Language Processing, Machine Learning, Artificial Intelligence, Information Retrieval, and Graph Theory. Dr. Khan has published extensively in reputed journals and continues to contribute by supervising 15 BS projects, 9 MS scholars, and 1 PhD scholar in areas such as NLP, Machine Learning, Artificial Intelligence, Blockchain, and Information Retrieval.

Academic Background
PhD Computer Science (Natural Language Processing, Machine Learning, Artificial Intelligence, Information Retrieval) Capital University of Science and Technology 2019
MS Software Engineering (Software Engineering, Formal Methods in Software Engineering) Bahria University, Islamabad 2010
BS Information Technology (Information Technology) Gomal University 2007
Experience
Assistant Professor Capital University of Science and Technology 01-Dec-2022 - 15-Sep-2024
Assistant Professor University of Sialkot 15-Jan-2020 - 30-Nov-2022
Honours and Awards
Dean Honor Award Attaining CGPA > 3.5 07-Aug-2016
Journal Publications
A Hybrid Deep Learning Based Fake News Detection System Using Temporal Features 10-Jun-2024 Detecting fake news and misinformation is gaining attention, especially with the growth of social media and online news platforms. Social media is the main and fastest channel of fake news propagation, while online news websites also contribute to its spread. In this study, we propose a framework that detects fake news using the temporal features of text and considers user feedback to determine whether a news item is fake. In recent studies, the temporal features of text documents have received considerable attention in Natural Language Processing alongside user feedback, but these studies only try to classify textual data as fake or true. This research article examines the impact of recurring and non-recurring events on fake and true news. We investigate several models, including LSTM, BERT, and CNN-BiLSTM, and conclude that BERT gives better results; 70% of true news is recurring and the remaining 30% is non-recurring.
Preventing 51% Attack By Using Consecutive Block Limits In Bitcoin 05-Mar-2024 In permissionless blockchain systems, Proof of Work (PoW) is utilized to address the issues of double-spending and transaction starvation. When an attacker acquires more than 50% of the hash power of the entire network, they gain the ability to engage in double-spending activities, posing a significant threat to the PoW consensus algorithm. This research focuses on the consensus algorithm employed in the Bitcoin system, explaining how it operates and the security challenges it faces. The proposed modification to the PoW algorithm imposes a restriction on miners: they are not allowed to accept consecutive blocks from the same miner into the final local blockchain to prevent the 51% attack problem. This modification supports transactions that require six confirmations. In the event an attacker attempts a 51% attack with a private chain that consists of fewer than 6 blocks, it becomes easier to detect a double-spending attack before accepting the attacker’s private chain. The modified algorithm introduces a “Safe Mode Detection Algorithm” that scrutinizes incoming blocks for adjustments at the top of the local blockchain. If inconsistencies are identified, the consensus algorithm proceeds cautiously by comparing the UTXO dictionaries from the attacker’s chain with those from the miner’s own blockchain. This meticulous comparison aims to detect instances of double-spending. If such instances are detected, the miner rejects the attacker’s chain, establishing a double-spend-free environment and thwarting 51% attacks.
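The two checks described in the abstract above, the consecutive-block restriction and the UTXO-based double-spend comparison, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the Block type, its miner_id and spends fields, and both function names are hypothetical, and the UTXO comparison is reduced to checking whether the same output is spent by different transactions on the two chains.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    # Hypothetical fields, chosen only for this illustration.
    miner_id: str                                # identity of the miner that produced the block
    spends: dict = field(default_factory=dict)   # spent output id -> id of the spending transaction


def violates_consecutive_limit(chain, new_block):
    """Return True if new_block was mined by the same miner as the current tip."""
    return bool(chain) and chain[-1].miner_id == new_block.miner_id


def find_double_spends(local_blocks, incoming_blocks):
    """Return the outputs spent by different transactions on the two forks."""
    local_spends = {}
    for block in local_blocks:
        local_spends.update(block.spends)

    conflicts = set()
    for block in incoming_blocks:
        for output_id, tx_id in block.spends.items():
            # Same output, different spending transaction: a double spend.
            if output_id in local_spends and local_spends[output_id] != tx_id:
                conflicts.add(output_id)
    return conflicts
```

Under these assumptions, a miner would reject an incoming block when violates_consecutive_limit returns True, and reject a competing chain when find_double_spends returns a non-empty set.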
Event-Dataset: Temporal information retrieval and text classification dataset 18-Aug-2019 Recently, Temporal Information Retrieval (TIR) has attracted major attention from the information retrieval community. TIR exploits the temporal dynamics in the information retrieval process and harnesses both textual relevance and temporal relevance to fulfill the temporal information requirements of a user (Ur Rehman Khan et al., 2018). The focus time of a document is an important temporal aspect, defined as the time to which the content of the document refers (Jatowt et al., 2015; Jatowt et al., 2013; Morbidoni et al., 2018; Khan et al., 2018). To the best of our knowledge, there is no publicly available standard benchmark dataset that can comprehensively evaluate the performance of focus time assessment strategies. Considering these aspects, we have produced the Event-dataset, which comprises 35 queries and a set of news articles for each query. Formally, the dataset C consists of a query set, and for each query there is a set of news articles divided into relevant and non-relevant documents. Each query in the dataset represents a popular event. To annotate these articles as relevant or non-relevant, we employed a user-study-based evaluation method in which a group of postgraduate students manually assigned the articles to these categories. We believe that such a dataset gives information retrieval researchers a benchmark for evaluating focus time assessment methods specifically and information retrieval methods generally.
Section-based focus time estimation of news articles 23-Nov-2018 Information retrieval systems embed temporal information for retrieving the news documents related to temporal queries. One of the important aspects of a news document is the focus time, the time to which the content of the document refers. The contemporary state-of-the-art does not exploit focus time to retrieve relevant news documents. This paper investigates the inverted pyramid news paradigm to determine the focus time of news documents by extracting temporal expressions, normalizing their values, and assigning them a score on the basis of their position in the text. In this method, the news documents are first divided into three sections following the inverted pyramid news paradigm. This paper presents a comprehensive analysis of four methods for splitting news documents into sections: the paragraph-based method, the words-based method, the sentence-based method, and the semantic-based method (SeBM). Temporal expressions in each section are assigned weights using a linear regression model. Finally, a scoring function is used to calculate a temporal score for each time expression appearing in the document. These temporal expressions are then ranked on the basis of their temporal score, where the most suitable expression appears on top. The effectiveness of the proposed method is evaluated on a diverse dataset of news related to popular events; the results revealed that the proposed splitting methods achieved an average error of less than 5.6 years, whereas the SeBM achieved a high precision score of 0.35 and 0.77 at positions 1 and 2, respectively.
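The section-weighted ranking described in the abstract above can be illustrated with a short sketch. It rests on several assumptions that do not come from the paper: the weight values are invented placeholders (the paper learns them with linear regression), temporal expressions are assumed to be already normalized to years, the split shown is a simple sentence-based one, and the score is taken to be the sum of the section weights of each mention.

```python
from collections import defaultdict

# Hypothetical section weights; the paper learns these with linear regression.
SECTION_WEIGHTS = (0.6, 0.3, 0.1)


def split_into_sections(sentences, parts=3):
    """Sentence-based split of a document into `parts` contiguous sections."""
    size = -(-len(sentences) // parts)  # ceiling division
    return [sentences[i * size:(i + 1) * size] for i in range(parts)]


def rank_years(sections_years, weights=SECTION_WEIGHTS):
    """Rank candidate years by the summed weight of the sections mentioning them.

    sections_years holds one list of normalized years per section.
    """
    scores = defaultdict(float)
    for years, weight in zip(sections_years, weights):
        for year in years:
            scores[year] += weight
    # Highest score first; ties are broken by the earlier year.
    return sorted(scores, key=lambda year: (-scores[year], year))
```

The first element of the list returned by rank_years plays the role of the estimated focus time in this sketch.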
Temporal specificity-based text classification for information retrieval 26-Jun-2018 Time is an important aspect in temporal information retrieval (TIR), a subfield of information retrieval (IR). Web search engines like Google or Bing are common examples of IR systems. An important constituent of a search engine is news retrieval, where users present their information needs in the form of temporal queries. Users are usually interested in news documents focusing on a particular time period. Existing search engines rarely fulfill the temporal information requirements as they ignore the temporal information available in the content of news documents, also known as document focus time. Furthermore, information related to multiple time periods in a news document makes the identification of document focus time a challenging task. Therefore, it is necessary to classify news documents based on temporal specificity before it is possible to use the temporal information in the retrieval process. In this study, we formulate the temporal specificity problem as a time-based classification task by classifying news documents into three temporal classes, i.e., high temporal specificity, medium temporal specificity, and low temporal specificity. For such classification, rule-based and temporal specificity score (TSS)-based classification approaches are proposed. In the former approach, news documents are classified using a defined set of rules that are based on temporal features. The latter approach classifies news documents based on a TSS score using the temporal features. The results of the proposed techniques are compared with four machine learning classification algorithms: Bayes net, support vector machine, random forest, and decision tree. The …
Comparative analysis of information retrieval models on Quran dataset in cross-language information retrieval systems 15-Nov-2011 English is an international language used for communication worldwide, yet many people cannot read, write, understand, or communicate in English. On the other hand, the World Wide Web holds vast information resources in different languages that native English speakers find challenging to understand. To overcome such barriers, Cross-Language Information Retrieval (CLIR) systems have been proposed; CLIR refers to document retrieval tasks across different languages. This work focuses on the performance evaluation of different Information Retrieval (IR) models in a CLIR system using a Quran dataset. Furthermore, this work also investigates query length and query expansion models for effective retrieval. The results show that query length has an impact on the effectiveness of the retrieval methods. Hence, after comprehensive experiments, an appropriate query length for an Arabic CLIR system is suggested, along with the best query expansion and retrieval model.
Courses
  • Discrete Structure
NAS: News Analytics Service using Spatiotemporal Information, HEC grant under the National Research Program for Universities (NRPU), 2021, Rs. 2.2 Million (Rs. 2,200,000)
Smart Farmer: Crop yield prediction and disease detection In today's world, agriculture is facing numerous challenges, including climate change, pests, and diseases. To address these issues and ensure sustainable food production, innovative solutions are essential. One such solution is the development of "Smart Farmer" systems, which leverage advanced technologies like artificial intelligence, machine learning, and Internet of Things (IoT) to predict crop yield and detect diseases early.
Key Components of a Smart Farmer System
  • IoT Sensors: These sensors are deployed throughout the agricultural field to collect data on various environmental factors such as temperature, humidity, soil moisture, and light intensity. This data is crucial for understanding the conditions that impact crop growth and development.
  • Image Processing: Cameras or drones equipped with image processing algorithms can analyze crops for signs of diseases, pests, or nutritional deficiencies. By identifying these issues early on, farmers can take timely corrective measures.
  • Machine Learning Algorithms: Machine learning models can analyze historical data and real-time sensor readings to predict crop yield. These models can also identify patterns that indicate the onset of diseases or pests.
  • Data Analytics: Advanced analytics tools can process and interpret large datasets to provide valuable insights into crop health and performance. This information can help farmers make informed decisions about irrigation, fertilization, and pest control.