sigma-cls / wrong_prediction_test.jsonl
Update SetFit classifier with new categories
81c4e40 verified
{"gold_label": "Generation", "pred": "Brainstorming", "instruction": "Create a list of 10 conversation starters for a family dinner that encourage open communication and bonding between family members. Each conversation starter should be phrased as a question and be specific to family and relationships.", "source": "kqsong/InFoBench"}
{"gold_label": "Generation", "pred": "Brainstorming", "instruction": "Please provide five unique first date night ideas, with each catering to different interests and budgets. Include an estimated cost for each idea, along with a brief description of the activity and why it might be appealing to different types of people.", "source": "kqsong/InFoBench"}
{"gold_label": "Generation", "pred": "Math", "instruction": "Design a garden path in a 10m x 10m garden, using a 10 x 10 grid to represent the area with each cell measuring 1m x 1m. Create a path (marked as \"P\") in the garden while the remaining places are used to plant flowers (marked as \"F\"). Each flower section separated by the path should have an area between 4 and 16 square meters. Ensure that all path cells are connected, allowing for continuous walking throughout the garden without stepping on the flowers. ", "source": "kqsong/InFoBench"}
{"gold_label": "Generation", "pred": "Coding", "instruction": "Generate a random alphanumeric string with exactly 20 characters.", "source": "kqsong/InFoBench"}
{"gold_label": "Generation", "pred": "Reasoning", "instruction": "The objective of the Item Elimination Game is to eliminate items in a 6x6 grid by aligning 3 or more of the same item consecutively in a row or column.\nHere are the rules:\n1. The grid consists of 6 rows and 6 columns, with each cell containing one of three different items represented by A, B, or C.\n2. In each step, you may remove items in several cells, provided that they form a consecutive sequence of 3 or more of the same item in the same row or column.\n3. Once the item in a cell is removed, it leaves a blank cell, represented by \"#\".\nBased on the rules, you should generate a 6x6 grid filled with 3 different kinds of items (A, B, and C) as the initial setup and play the game for one step: If you can identify items that form a sequence of 3 or more of the same item in a row or column, leaving blank cells (\"#\") in their place and generating the resulting grid; if you cannot find such a sequence, explain it.", "source": "kqsong/InFoBench"}
{"gold_label": "Generation", "pred": "Brainstorming", "instruction": "Create a list of 5 off-the-beaten-path travel destinations in the magical world of Harry Potter created by J.K. Rowling. Each destination should be accompanied by a brief description (50-75 words) explaining what makes it unique, what magical or enchanting activities are available. Make sure to emphasize the magical aspect and avoid real-world locations to fully immerse readers in the fictional universe.", "source": "kqsong/InFoBench"}
{"gold_label": "Generation", "pred": "Brainstorming", "instruction": "I want to develop three hobbies: painting, dancing and photography. I only have time to do this when I finish work at 6 pm and before I go to sleep at 12 pm on weekdays, and I have the whole day available at weekends. I need a weekly plan that balances the time allocated to each hobby and ensures that each day's time for hobby development is different. Could you provide me with such a plan?", "source": "kqsong/InFoBench"}
{"gold_label": "Generation", "pred": "Math", "instruction": "For the following grid where \"S\" denotes the starting point, \"F\" represents the finish line, \"#\" signifies walls or barriers, and \".\" indicates open paths, your task is to find the unique path from the start to the finish. Please provide step-by-step directions in the format \"Move right/down/left/up\" followed by the number of steps. Note that you can only move horizontally or vertically, and you cannot pass through walls or move diagonally. ", "source": "kqsong/InFoBench"}
{"gold_label": "Generation", "pred": "Extraction", "instruction": "Generate a Chronology of the United States from 1960 to present day, with at least 4 and no more than 6 events listed per decade. Start each event with an annotation \"-\", include the year it occurred, organize the decades in chronological order, and arrange the events within a decade in reverse chronological order.", "source": "kqsong/InFoBench"}
{"gold_label": "Generation", "pred": "Brainstorming", "instruction": "As an American visiting China for the first time, list 3 cultural practices or customs that may be surprising due to cultural differences. Provide a brief explanation for each item, comparing how it is different in China and America. Please avoid controversial topics or those that promote hate speech, discrimination, or misinformation. Focus on fostering cultural understanding and appreciation.", "source": "kqsong/InFoBench"}
{"gold_label": "Generation", "pred": "Coding", "instruction": "Design a fun psychology test to assess a person's personality. Create a test consisting of 6 questions, with each question having 4 possible answers. Assign a separate score to each answer. Design test results that include a personality type and a description of the personality. Ensure that there are at least 5 personality categories. Make sure the score ranges for the test results correspond to the possible scores that a person can obtain after completing all 6 questions. Please provide clear instructions for calculating the final score and determining the corresponding personality category.", "source": "kqsong/InFoBench"}
{"gold_label": "Generation", "pred": "Brainstorming", "instruction": "Generate a list of five distinct strategies to foster gender equality within a professional environment. Each strategy should be explained in 50 words or less. Index each strategy using Roman numerals. Also, associate a difficulty level with each strategy, ranging from 'Easy' to 'Medium' to 'Hard'. The difficulty levels should represent the practical implementation perspective and must be in increasing order, starting with 'Easy' for the first strategy and ending with 'Hard' for the fifth strategy.", "source": "kqsong/InFoBench"}
{"gold_label": "Generation", "pred": "Brainstorming", "instruction": "Explain three innovative techniques presently employed in civil engineering for sustainable construction. Include real-life examples for each technique and discuss their environmental advantages.", "source": "kqsong/InFoBench"}
{"gold_label": "Generation", "pred": "Extraction", "instruction": "In the Matching Game, your objective is to clear a 10x10 grid by matching pairs of identical items. The grid contains 5 distinct item types (A, B, C, D, and E) and some blank cells (#). Additionally, the grid will have row and column numbers (from 1 to 10). At least half the grid cells should be occupied by items. To remove a pair of items, they must be of the same type and you must identify a path connecting them that only moves up, down, left, or right, passing exclusively through blank cells. Create a grid that ensures the inclusion of at least one valid pair of matching items that can be eliminated. In the given grid, locate and eliminate the first valid pair of matching items by indicating the row and column of each item and describing the connecting path between them.", "source": "kqsong/InFoBench"}
{"gold_label": "Generation", "pred": "Extraction", "instruction": "Generate a list of 20 English words that you need to learn in 10 days. Plan to learn 5 new words each day for the first four days. Each day's word list will include both new words and words to review. According to Ebbinghaus's Forgetting Curve, you should review words on the 1st, 2nd, 4th, and 7th days after learning them. For example, words learned on Day 1 should be reviewed on Days 2, 3, 5, and 8. Now, generate the word list for each of the 10 days, with each day's list preceded by the title \"DAY\" and the day's number.", "source": "kqsong/InFoBench"}
{"gold_label": "Generation", "pred": "Extraction", "instruction": "In the Board Game Strategy Challenge, you are playing a simplified version of a strategy board game against an opponent. The game consists of a 5x5 grid, and you have three types of units: Knights (K), Archers (A), and Wizards (W). Each type of unit has specific movement and attack patterns. Your objective is to eliminate all of your opponent's units. Given the initial grid state in the input, your units are on the bottom row (W A K A W), and your opponent's units are on the top row (A K W A K). Your goal is to design specific movement and attack patterns for each type of unit, and then, based on the current grid state and rules you designed, describe your next move and show the grid after your move.", "source": "kqsong/InFoBench"}
{"gold_label": "Generation", "pred": "Factual QA", "instruction": "Today is May 1, 2023. The following is a countdown calendar, what will the countdown calendar look like on May 14, 2023? Ensure to clearly denote the events or days that have already passed.", "source": "kqsong/InFoBench"}
{"gold_label": "Generation", "pred": "Brainstorming", "instruction": "Provide five pairs of homophones in English, where each pair of words is spelled differently but pronounced the same, and explain their meanings.", "source": "kqsong/InFoBench"}
{"gold_label": "Generation", "pred": "Brainstorming", "instruction": "Generate a list of 10 words that follow the consonant-vowel-consonant pattern.", "source": "kqsong/InFoBench"}
{"gold_label": "Generation", "pred": "Math", "instruction": "Create a 5x5 word puzzle grid filled with letters of the alphabet that contains at least 5 hidden words. Ensure that there are at least 3 valid words in the 5 horizontal rows (from left to right) and at least 2 valid words in the 5 vertical columns (from top to bottom). Please provide an example of such a grid and generate all the valid words.", "source": "kqsong/InFoBench"}
{"gold_label": "Generation", "pred": "Brainstorming", "instruction": "Generate a list of 10 museums worldwide. This list should include at least one museum from each continent, excluding Antarctica, to ensure global coverage. For each museum, provide a concise description of what it offers and specify its exact location, including city and country. Make sure the information you provided is correct.", "source": "kqsong/InFoBench"}
{"gold_label": "Generation", "pred": "Extraction", "instruction": "The Movie Name Anagram Game requires you to unscramble anagrams of famous movie names. The task is to rearrange the letters to reveal the original movie name. Design three examples of scrambled words and provide their respective answers for the Movie Name Anagram Game.", "source": "kqsong/InFoBench"}
{"gold_label": "Generation", "pred": "Extraction", "instruction": "Generate a transcript for a 4-year undergraduate student. The transcript should include the course name (including an indication of whether the course is a major or a selective course), the units of each course, the semester and year of each course, the grade of each course (using the alphabet letter), the major courses' GPA per year, and the overall GPA of all courses for the entire undergraduate time. Every semester should have at least 36 units of courses and at most 60 units of courses, with each course having either 6, 9, 12, or 15 units. Additionally, the transcript should include instructions on how to calculate the GPA, which is rated on a 4-point scale (with 4.0 as the highest and 0.0 as the lowest).", "source": "kqsong/InFoBench"}
{"gold_label": "Generation", "pred": "Brainstorming", "instruction": "Generate two lists of words:\n1. Consisting of at least 6 monosyllabic words.\n2. Incorporating at least 5 onomatopoeia.", "source": "kqsong/InFoBench"}
{"gold_label": "Generation", "pred": "Math", "instruction": "Create a sequence of 5 different dance formations for a group of 8 people, represented by A, B, C, D, E, F, G, and H. Use an 8x8 grid to represent the stage, with each cell containing a person's notation or \"#\" if unoccupied. Indicate the audience's position. Ensure the following:\n1. A is the lead dancer and always closer to the audience than any other dancer.\n2. In the first formation, dancers except for the lead dancer form a line parallel to the audience.\n3. In the third formation, some dancers create a triangular shape.\nProvide 5 distinct 8x8 grids showcasing each formation and also indicate where the audience is. ", "source": "kqsong/InFoBench"}
{"gold_label": "Generation", "pred": "Extraction", "instruction": "Generate twelve word clocks, each representing a unique time zone and featuring a different city. Distribute these clocks into two groups of six. The clocks in the first group should indicate the time using a 12-hour format, and the clocks in the second group should employ a 24-hour format. \n\nTo ensure geographical diversity, select at least one city from every continent, excluding Antarctica. Each clock should clearly display the city's name, the continent the city belongs to and the current date underneath. \n\nLastly, ensure that the displayed time and date for each city accurately reflect the real-world time differences between the various time zones.", "source": "kqsong/InFoBench"}
{"gold_label": "Generation", "pred": "Coding", "instruction": "Design a simple implicit association test (IAT) for assessing gender stereotypes, provide a detailed description of the test, create a table for recording the test results, populate the table with hypothetical test results, and analyze the conclusions drawn from these results.", "source": "kqsong/InFoBench"}
{"gold_label": "Generation", "pred": "Extraction", "instruction": "Please process the provided article related to the criminal case involving O.J. Simpson. From this article, identify and extract the following specific pieces of information: the accused, the victims, the members of the defense team, the allegations, the duration of the trial, the verdict, the civil lawsuit, the damages awarded, and any subsequent conviction. If any of these elements cannot be found in the article, list them as 'Missing'. Please present each entity with its corresponding extracted value in a new line, using the format: 'Entity Name: Extracted Value'.", "source": "kqsong/InFoBench"}
{"gold_label": "Generation", "pred": "Coding", "instruction": "Generate a Python code snippet that constructs a two-hidden layer feedforward neural network using the PyTorch `torch.nn` module, ensuring that the architecture consists of an input layer, two hidden layers, and an output layer. The hidden layers should utilize the ReLU activation function and have a neuron count ranging between 32 and 128 to maintain a reasonably sized network. Include comments in the code to clarify each step of the process.", "source": "kqsong/InFoBench"}
{"gold_label": "Generation", "pred": "Brainstorming", "instruction": "Create a list of the primary causes and effects of climate change. For each entry, focus on a single cause and its associated effects. Organize the list by writing \"Cause:\" followed by a one-sentence description, then start a new line with \"Effect:\" and provide a one-sentence explanation.", "source": "kqsong/InFoBench"}
{"gold_label": "Generation", "pred": "Extraction", "instruction": "Generate an airline information for a flight from Seattle to New York with two stops, using the following format:\n1. The first line should include the airline name and flight number.\n2. The second line should list the departure city, both stopover cities, and the destination city, connected by \"->\".\n3. The third line should indicate the corresponding departure and arrival times at each city, connected by \"->\". For the stopover cities, list both arrival and departure times, separated by a comma and a space, without using \"->\".\n4. The last line should indicate the total price of the flight.\nAdditionally, ensure that the provided layover times at the stopover cities are larger than 1 hour, and that the overall travel time is within a realistic range for the given route.", "source": "kqsong/InFoBench"}
{"gold_label": "Generation", "pred": "Brainstorming", "instruction": "Please generate a list of 10 research papers published within the last 5 years (from 2023), focusing on the evaluation of pretrained language models. Present the list in APA citation style and arrange the papers in ascending chronological order.", "source": "kqsong/InFoBench"}
{"gold_label": "Generation", "pred": "Brainstorming", "instruction": "List the pros and cons of using two different names for the same thing. Make sure the word synonyms appears at least 3 time.", "source": "google/IFEval"}
{"gold_label": "Generation", "pred": "Brainstorming", "instruction": "Can Batman beat Superman in a fair one on one fight?\nYou should just say \u201dMy answer is yes.\" or \"My answer is no.\" or \"My answer is maybe.\"", "source": "google/IFEval"}
{"gold_label": "Generation", "pred": "Factual QA", "instruction": "Can you provide me with the timetable for the next train to London? Please respond in less than 6 sentences.", "source": "google/IFEval"}
{"gold_label": "Generation", "pred": "Factual QA", "instruction": "I need a list of the top 10 attorneys in the US. Your list should be in the format of 10 bullet points, following the format below:\n\n* Bullet 1\n* Bullet 2\n* ...", "source": "google/IFEval"}
{"gold_label": "Generation", "pred": "Reasoning", "instruction": "If Bob beat Martha in a game of pool. And Martha beat Joe in a game of pool. Can Bob beat Joe in a game of pool?\n\nYour answer must contain exactly one of the following phrases:\nMy answer is yes.\nMy answer is no.\nMy answer is maybe.", "source": "google/IFEval"}
{"gold_label": "Generation", "pred": "Brainstorming", "instruction": "Why didn't the 2022 winter olympics have the best ratings? Make sure to include the letter y at least 5 times, and include the keywords talented and tianjin.", "source": "google/IFEval"}
{"gold_label": "Generation", "pred": "Factual QA", "instruction": "Is \"jiang\" a Chinese name? What are other names similar to \"jiang\"? Separate your two answers with ******", "source": "google/IFEval"}
{"gold_label": "Generation", "pred": "Factual QA", "instruction": "what is the difference between a levee and an embankment? Please respond to me only in Korean.", "source": "google/IFEval"}
{"gold_label": "Generation", "pred": "Factual QA", "instruction": "Are the weather conditions in the Arctic very cold most of the year? Do not say 'yes' or 'no' throughout your entire response.", "source": "google/IFEval"}
{"gold_label": "Generation", "pred": "Brainstorming", "instruction": "Is the sentence \"Mrs. Smith is the teacher of this class.\" grammatically correct? Give me exactly two different responses. Responses and only responses should be separated by 6 asterisk symbols: ******.", "source": "google/IFEval"}
{"gold_label": "Generation", "pred": "Factual QA", "instruction": "What happened when the Tang dynasty of China was in power? Make sure to use the word war at least 8 times, and the word peace at least 10 times.", "source": "google/IFEval"}
{"gold_label": "Generation", "pred": "Math", "instruction": "A colt is 5 feet tall. It will grow 6 inches every month. How tall will it be in 3 years? Think step-by-step, then give your answer. Separate your thinking and the final answer by a line with just three \"*\" symbols: ***\nAt the end of your response, please explicitly add a postscript starting with P.P.S", "source": "google/IFEval"}
{"gold_label": "Generation", "pred": "Brainstorming", "instruction": "Explain to me how to ride a bike like I am a kid. Also, do not include the keywords \"slow\", \"like\" and \"kid\".", "source": "google/IFEval"}
{"gold_label": "Generation", "pred": "Math", "instruction": "If a + b + c = 30 and b = 10 and c = 5. Is a = 20? Answer \u201dMy answer is yes.\" or \"My answer is no.\" or \"My answer is maybe.\"", "source": "google/IFEval"}
{"gold_label": "Generation", "pred": "Brainstorming", "instruction": "Why are there 396 calories in a serving of 32 grams of unsalted almonds? Do not use commas in your response.", "source": "google/IFEval"}
{"gold_label": "Generation", "pred": "Factual QA", "instruction": "What is the name of the green-eyed monster that makes people jealous? Your response should be less than 3 sentences (just 1 sentence or 2 sentences).", "source": "google/IFEval"}
{"gold_label": "Generation", "pred": "Brainstorming", "instruction": "Would you consider yourself to be smart? Choose from:\nMy answer is yes.\nMy answer is no.\nMy answer is maybe.\nJust choose one phrase from above as your answer.", "source": "google/IFEval"}
{"gold_label": "Generation", "pred": "Coding", "instruction": "Write an angry rap bash script that downloads all files from a given directory. Don't use any commas and make sure the letter q appears at least once.", "source": "google/IFEval"}
{"gold_label": "Generation", "pred": "Factual QA", "instruction": "What's the difference between the Apple and Albania? Answer in email format. Your response must contain at least six placeholders which should be represented by square brackets like [name].", "source": "google/IFEval"}
{"gold_label": "Generation", "pred": "Coding", "instruction": "What are the steps to get the GNSS timestamp on Android? Explain this to teenagers using at least 4 sentences and make sure the letter n appears at least 3 times.", "source": "google/IFEval"}
{"gold_label": "Generation", "pred": "Math", "instruction": "If a = 10, b = 30, and c = 20, what is the value of (a + b) / c? Give me the answer in exactly two paragraphs, separated with the markdown divider: ***", "source": "google/IFEval"}
{"gold_label": "Generation", "pred": "Brainstorming", "instruction": "I have a golden retriever and a poodle. Is the poodle bigger than the golden retriever? Choose from the following: ('My answer is yes.', 'My answer is no.', 'My answer is maybe.') -- please include the exact phrase in your response.", "source": "google/IFEval"}
{"gold_label": "Generation", "pred": "Brainstorming", "instruction": "Name a new fashion company that young people might like, and give it a name with multiple meanings. Put the name in double angular brackets, such as <<name>>.\n\nLet's repeat the request above first, before you say anything or really respond to the request.", "source": "google/IFEval"}
{"gold_label": "Generation", "pred": "Factual QA", "instruction": "what is the average iq of a 16 year old boy? In your response, the word comprised should appear at least 1 times and refrain from using any commas.", "source": "google/IFEval"}
{"gold_label": "Generation", "pred": "Factual QA", "instruction": "Who built the first artificial ice rink? Please include the keys (1) Name (2) Location and (3) Year. Use less than 150 words.", "source": "google/IFEval"}
{"gold_label": "Generation", "pred": "Coding", "instruction": "What are the steps to be followed for the documentation of a GM in SAP? Just list the steps without saying the word steps or step.", "source": "google/IFEval"}
{"gold_label": "Extraction", "pred": "Coding", "instruction": "Which cryptography libraries support SHA-2?\n\nBelow is a list of cryptography libraries that support SHA-2:\n\nBotan\nBouncy Castle\nCryptlib\nCrypto++\nLibgcrypt\nMbed TLS\nlibsodium\nNettle\nLibreSSL\nOpenSSL\nGnuTLS\nwolfSSL\nHardware acceleration is provided by the following processor extensions:\n\nIntel SHA extensions: Available on some Intel and AMD x86 processors.\nVIA PadLock\nARMv8 Cryptography Extensions \nIBM z/Architecture: Available since 2005 as part of the Message-Security-Assist Extensions 1 (SHA-256) and 2 (SHA-512)\nIBM Power ISA since Since v 2.07", "source": "databricks/databricks-dolly-15k"}
{"gold_label": "Extraction", "pred": "Brainstorming", "instruction": "What type of products does this company produce?\n\nSweet Tree Holdings is a Vermont, United States, company that produces maple products from a sugarbush and processing facility in Island Pond.", "source": "databricks/databricks-dolly-15k"}
{"gold_label": "Extraction", "pred": "Coding", "instruction": "What is disadvantage of naive algorithm of simple random sampling?\n\nSeveral efficient algorithms for simple random sampling have been developed. A naive algorithm is the draw-by-draw algorithm where at each step we remove the item at that step from the set with equal probability and put the item in the sample. We continue until we have sample of desired size \n\ufffd\nk. The drawback of this method is that it requires random access in the set.\n\nThe selection-rejection algorithm developed by Fan et al. in 1962 requires a single pass over data; however, it is a sequential algorithm and requires knowledge of total count of items \n\ufffd\nn, which is not available in streaming scenarios.\n\nA very simple random sort algorithm was proved by Sunter in 1977. The algorithm simply assigns a random number drawn from uniform distribution \n(\n0\n,\n1\n)\n(0,1) as a key to each item, then sorts all items using the key and selects the smallest \n\ufffd\nk items.\n\nJ. Vitter in 1985 proposed reservoir sampling algorithms, which are widely used. This algorithm does not require knowledge of the size of the population \n\ufffd\nn in advance, and uses constant space.\n\nRandom sampling can also be accelerated by sampling from the distribution of gaps between samples and skipping over the gaps.", "source": "databricks/databricks-dolly-15k"}
{"gold_label": "Extraction", "pred": "Generation", "instruction": "Change the following passage to past tense\n\nThe episode focuses on two people: an elderly hospital doctor (Aleksander Bardini), who lives by himself in the ubiquitous Dekalog apartment block; and Dorota Geller (Krystyna Janda), a woman in her 30s who lives in the same building and is a violinist with the Philharmonic Orchestra.", "source": "databricks/databricks-dolly-15k"}
{"gold_label": "Extraction", "pred": "Generation", "instruction": "Based on the following passage summarise how electricity became a driving force during the second industrial revolution.\n\nElectricity is the set of physical phenomena associated with the presence and motion of matter that has a property of electric charge. Electricity is related to magnetism, both being part of the phenomenon of electromagnetism, as described by Maxwell's equations. Various common phenomena are related to electricity, including lightning, static electricity, electric heating, electric discharges and many others.\nThe presence of either a positive or negative electric charge produces an electric field. The movement of electric charges is an electric current and produces a magnetic field. In most applications, a force acts on a charge with a magnitude given by Coulomb's law. Electric potential is typically measured in volts.\nElectricity is at the heart of many modern technologies, being used for:\na. Electric power where electric current is used to energise equipment;\nb. Electronics which deals with electrical circuits that involve active electrical components such as vacuum tubes, transistors, diodes and integrated circuits, and associated passive interconnection technologies.\nElectrical phenomena have been studied since antiquity, though progress in theoretical understanding remained slow until the 17th and 18th centuries. The theory of electromagnetism was developed in the 19th century, and by the end of that century electricity was being put to industrial and residential use by electrical engineers. The rapid expansion in electrical technology at this time transformed industry and society, becoming a driving force for the Second Industrial Revolution. Electricity's extraordinary versatility means it can be put to an almost limitless set of applications which include transport, heating, lighting, communications, and computation. Electrical power is now the backbone of modern industrial society.", "source": "databricks/databricks-dolly-15k"}
{"gold_label": "Extraction", "pred": "Generation", "instruction": "Change this sentence to paste tense\n\nElstow is a special service area within the Rural Municipality of Blucher No. 343 in Saskatchewan, Canada.", "source": "databricks/databricks-dolly-15k"}
{"gold_label": "Extraction", "pred": "Factual QA", "instruction": "When did the golden age for Jewish culture start and where was it?\n\nA few scholars give the start of the Golden Age as 711\u2013718, the Muslim conquest of Iberia.", "source": "databricks/databricks-dolly-15k"}
{"gold_label": "Reasoning", "pred": "Extraction", "instruction": "Today, Hannah went to the art studio. Between what times could they have gone?\nWe know that:\nHannah woke up at 6am.\nThomas saw Hannah reading at the library from 6am to 8am.\nDavid saw Hannah getting a coffee at the cafe from 8am to 10am.\nLeslie saw Hannah stretching at a yoga studio from 6pm to 7pm.\nAshley saw Hannah walking towards the Statue of Liberty from 7pm to 8pm.\nThe art studio was closed after 8pm.\nBetween what times could Hannah have gone to the art studio?\nOptions:\n(A) 10am to 6pm\n(B) 7pm to 8pm\n(C) 6am to 8am\n(D) 8am to 10am", "source": "lighteval/big_bench_hard"}
{"gold_label": "Reasoning", "pred": "Extraction", "instruction": "Today, Tiffany went to the market. Between what times could they have gone?\nWe know that:\nTiffany woke up at 5am.\nJason saw Tiffany driving to the water park from 5am to 6am.\nSteven saw Tiffany getting a coffee at the cafe from 6am to 12pm.\nJessica saw Tiffany buying lunch at the deli from 12pm to 4pm.\nThomas saw Tiffany working at the office from 5pm to 7pm.\nThe market was closed after 7pm.\nBetween what times could Tiffany have gone to the market?\nOptions:\n(A) 6am to 12pm\n(B) 4pm to 5pm\n(C) 5am to 6am\n(D) 5pm to 7pm", "source": "lighteval/big_bench_hard"}
{"gold_label": "Reasoning", "pred": "Extraction", "instruction": "Today, William went to the market. Between what times could they have gone?\nWe know that:\nWilliam woke up at 8am.\nThomas saw William fixing their computer at the electronic store from 8am to 10am.\nJames saw William waiting at the train station from 10am to 11am.\nSteven saw William sitting on a rooftop from 11am to 6pm.\nEmily saw William taking photos near the Leaning Tower of Pisa from 6pm to 9pm.\nThe market was closed after 10pm.\nBetween what times could William have gone to the market?\nOptions:\n(A) 11am to 6pm\n(B) 9pm to 10pm\n(C) 6pm to 9pm\n(D) 8am to 10am", "source": "lighteval/big_bench_hard"}
{"gold_label": "Reasoning", "pred": "Extraction", "instruction": "Today, Nancy went to the beach. Between what times could they have gone?\nWe know that:\nNancy woke up at 6am.\nJohn saw Nancy walking in the garden from 6am to 10am.\nTiffany saw Nancy working at the office from 10am to 11am.\nLeslie saw Nancy buying clothes at the mall from 11am to 12pm.\nMichael saw Nancy taking photos near the Leaning Tower of Pisa from 1pm to 6pm.\nThe beach was closed after 6pm.\nBetween what times could Nancy have gone to the beach?\nOptions:\n(A) 12pm to 1pm\n(B) 10am to 11am\n(C) 6am to 10am\n(D) 1pm to 6pm", "source": "lighteval/big_bench_hard"}
{"gold_label": "Reasoning", "pred": "Extraction", "instruction": "Today, Tiffany went to the clothing store. Between what times could they have gone?\nWe know that:\nTiffany woke up at 6am.\nMichael saw Tiffany buying lunch at the deli from 12pm to 5pm.\nAndrew saw Tiffany playing tennis at the tennis court from 5pm to 6pm.\nJohn saw Tiffany driving to the water park from 6pm to 8pm.\nSarah saw Tiffany watching a movie at the theater from 8pm to 10pm.\nThe clothing store was closed after 10pm.\nBetween what times could Tiffany have gone to the clothing store?\nOptions:\n(A) 5pm to 6pm\n(B) 6pm to 8pm\n(C) 6am to 12pm\n(D) 8pm to 10pm", "source": "lighteval/big_bench_hard"}
{"gold_label": "Reasoning", "pred": "Extraction", "instruction": "Today, James went to the swimming pool. Between what times could they have gone?\nWe know that:\nJames woke up at 5am.\nAshley saw James sitting on a rooftop from 7am to 9am.\nHannah saw James taking photos near the Eiffel Tower from 9am to 7pm.\nWilliam saw James buying cookies at a bakery from 7pm to 8pm.\nLinda saw James waiting at the train station from 8pm to 10pm.\nThe swimming pool was closed after 10pm.\nBetween what times could James have gone to the swimming pool?\nOptions:\n(A) 7pm to 8pm\n(B) 8pm to 10pm\n(C) 5am to 7am\n(D) 7am to 9am", "source": "lighteval/big_bench_hard"}
{"gold_label": "Reasoning", "pred": "Extraction", "instruction": "Today, William went to the art studio. Between what times could they have gone?\nWe know that:\nWilliam woke up at 8am.\nSteven saw William working at the office from 8am to 10am.\nDavid saw William working out at the gym from 10am to 3pm.\nKimberly saw William taking photos near the Leaning Tower of Pisa from 3pm to 5pm.\nMary saw William buying clothes at the mall from 6pm to 8pm.\nSean saw William taking photos near the Eiffel Tower from 8pm to 10pm.\nThe art studio was closed after 10pm.\nBetween what times could William have gone to the art studio?\nOptions:\n(A) 8pm to 10pm\n(B) 6pm to 8pm\n(C) 10am to 3pm\n(D) 5pm to 6pm", "source": "lighteval/big_bench_hard"}
{"gold_label": "Reasoning", "pred": "Extraction", "instruction": "Today, William went to the bakery. Between what times could they have gone?\nWe know that:\nWilliam woke up at 6am.\nDavid saw William waiting at the airport from 8am to 9am.\nMichael saw William attending class at the school from 9am to 1pm.\nSarah saw William working out at the gym from 1pm to 2pm.\nJessica saw William taking photos near the Eiffel Tower from 2pm to 9pm.\nThe bakery was closed after 9pm.\nBetween what times could William have gone to the bakery?\nOptions:\n(A) 1pm to 2pm\n(B) 8am to 9am\n(C) 9am to 1pm\n(D) 6am to 8am", "source": "lighteval/big_bench_hard"}
{"gold_label": "Reasoning", "pred": "Extraction", "instruction": "Today, Elizabeth went to the park. Between what times could they have gone?\nWe know that:\nElizabeth woke up at 10am.\nEmily saw Elizabeth buying a bike at the bike shop from 10am to 11am.\nLeslie saw Elizabeth buying lunch at the deli from 11am to 2pm.\nSarah saw Elizabeth playing tennis at the tennis court from 4pm to 8pm.\nThe park was closed after 8pm.\nBetween what times could Elizabeth have gone to the park?\nOptions:\n(A) 10am to 11am\n(B) 11am to 2pm\n(C) 4pm to 8pm\n(D) 2pm to 4pm", "source": "lighteval/big_bench_hard"}
{"gold_label": "Reasoning", "pred": "Math", "instruction": "Today, James went to the restaurant. Between what times could they have gone?\nWe know that:\nJames woke up at 12pm.\nElizabeth saw James stretching at a yoga studio from 3pm to 4pm.\nMichael saw James fixing their computer at the electronic store from 4pm to 7pm.\nMary saw James working at the office from 7pm to 9pm.\nNancy saw James working out at the gym from 9pm to 10pm.\nThe restaurant was closed after 10pm.\nBetween what times could James have gone to the restaurant?\nOptions:\n(A) 12pm to 3pm\n(B) 3pm to 4pm\n(C) 7pm to 9pm\n(D) 9pm to 10pm", "source": "lighteval/big_bench_hard"}
{"gold_label": "Reasoning", "pred": "Extraction", "instruction": "Today, Elizabeth went to the movies. Between what times could they have gone?\nWe know that:\nElizabeth woke up at 5am.\nSarah saw Elizabeth getting a coffee at the cafe from 5am to 3pm.\nSusan saw Elizabeth working out at the gym from 3pm to 4pm.\nJohn saw Elizabeth buying a phone at the electronics store from 4pm to 6pm.\nTiffany saw Elizabeth waiting at the train station from 6pm to 7pm.\nAndrew saw Elizabeth waiting at the airport from 7pm to 8pm.\nThe movies was closed after 9pm.\nBetween what times could Elizabeth have gone to the movies?\nOptions:\n(A) 8pm to 9pm\n(B) 4pm to 6pm\n(C) 5am to 3pm\n(D) 6pm to 7pm", "source": "lighteval/big_bench_hard"}
{"gold_label": "Reasoning", "pred": "Math", "instruction": "Today, Ashley went to the bookstore. Between what times could they have gone?\nWe know that:\nAshley woke up at 5am.\nJason saw Ashley buying lunch at the deli from 11am to 1pm.\nEmily saw Ashley driving to the water park from 1pm to 3pm.\nKimberly saw Ashley stretching at a yoga studio from 3pm to 6pm.\nThe bookstore was closed after 6pm.\nBetween what times could Ashley have gone to the bookstore?\nOptions:\n(A) 1pm to 3pm\n(B) 11am to 1pm\n(C) 3pm to 6pm\n(D) 5am to 11am", "source": "lighteval/big_bench_hard"}
{"gold_label": "Reasoning", "pred": "Extraction", "instruction": "Today, William went to the art studio. Between what times could they have gone?\nWe know that:\nWilliam woke up at 6am.\nAnthony saw William getting a coffee at the cafe from 6am to 7am.\nJohn saw William watching a movie at the theater from 8am to 10am.\nHannah saw William waiting at the train station from 10am to 12pm.\nMark saw William walking towards the Statue of Liberty from 12pm to 1pm.\nWilliam saw William working out at the gym from 1pm to 6pm.\nThe art studio was closed after 6pm.\nBetween what times could William have gone to the art studio?\nOptions:\n(A) 8am to 10am\n(B) 7am to 8am\n(C) 1pm to 6pm\n(D) 12pm to 1pm", "source": "lighteval/big_bench_hard"}
{"gold_label": "Reasoning", "pred": "Math", "instruction": "Today, Sarah went to the gas station. Between what times could they have gone?\nWe know that:\nSarah woke up at 6am.\nSamantha saw Sarah reading at the library from 6am to 9am.\nMary saw Sarah taking photos near the Eiffel Tower from 9am to 11am.\nSteven saw Sarah buying cookies at a bakery from 11am to 2pm.\nThe gas station was closed after 9pm.\nBetween what times could Sarah have gone to the gas station?\nOptions:\n(A) 11am to 2pm\n(B) 2pm to 9pm\n(C) 9am to 11am\n(D) 6am to 9am", "source": "lighteval/big_bench_hard"}
{"gold_label": "Reasoning", "pred": "Extraction", "instruction": "Today, Anthony went to the basketball court. Between what times could they have gone?\nWe know that:\nAnthony woke up at 5am.\nSarah saw Anthony buying clothes at the mall from 5am to 7am.\nTiffany saw Anthony buying lunch at the deli from 7am to 12pm.\nSean saw Anthony watching a movie at the theater from 12pm to 2pm.\nEmily saw Anthony buying a phone at the electronics store from 2pm to 4pm.\nThe basketball court was closed after 9pm.\nBetween what times could Anthony have gone to the basketball court?\nOptions:\n(A) 5am to 7am\n(B) 12pm to 2pm\n(C) 4pm to 9pm\n(D) 7am to 12pm", "source": "lighteval/big_bench_hard"}
{"gold_label": "Reasoning", "pred": "Extraction", "instruction": "Today, Sarah went to the art studio. Between what times could they have gone?\nWe know that:\nSarah woke up at 9am.\nRichard saw Sarah buying a bike at the bike shop from 9am to 12pm.\nBetty saw Sarah waiting at the train station from 12pm to 1pm.\nMichael saw Sarah buying clothes at the mall from 1pm to 3pm.\nJennifer saw Sarah working out at the gym from 3pm to 6pm.\nHannah saw Sarah buying a phone at the electronics store from 6pm to 8pm.\nThe art studio was closed after 10pm.\nBetween what times could Sarah have gone to the art studio?\nOptions:\n(A) 9am to 12pm\n(B) 3pm to 6pm\n(C) 12pm to 1pm\n(D) 8pm to 10pm", "source": "lighteval/big_bench_hard"}
{"gold_label": "Reasoning", "pred": "Extraction", "instruction": "Today, Michael went to the park. Between what times could they have gone?\nWe know that:\nMichael woke up at 6am.\nLinda saw Michael walking towards the Statue of Liberty from 11am to 12pm.\nBetty saw Michael reading at the library from 12pm to 5pm.\nSteven saw Michael taking photos near the Eiffel Tower from 5pm to 7pm.\nDavid saw Michael getting a coffee at the cafe from 7pm to 8pm.\nThe park was closed after 8pm.\nBetween what times could Michael have gone to the park?\nOptions:\n(A) 6am to 11am\n(B) 12pm to 5pm\n(C) 5pm to 7pm\n(D) 11am to 12pm", "source": "lighteval/big_bench_hard"}
{"gold_label": "Reasoning", "pred": "Extraction", "instruction": "Today, Susan went to the amusement park. Between what times could they have gone?\nWe know that:\nSusan woke up at 5am.\nAnthony saw Susan waiting at the train station from 5am to 8am.\nJames saw Susan watching a movie at the theater from 11am to 5pm.\nMary saw Susan sitting on a rooftop from 5pm to 9pm.\nThe amusement park was closed after 9pm.\nBetween what times could Susan have gone to the amusement park?\nOptions:\n(A) 8am to 11am\n(B) 5am to 8am\n(C) 5pm to 9pm\n(D) 11am to 5pm", "source": "lighteval/big_bench_hard"}
{"gold_label": "Reasoning", "pred": "Extraction", "instruction": "Today, Elizabeth went to the swimming pool. Between what times could they have gone?\nWe know that:\nElizabeth woke up at 5am.\nJason saw Elizabeth taking photos near the Eiffel Tower from 5am to 8am.\nJohn saw Elizabeth playing tennis at the tennis court from 8am to 10am.\nDavid saw Elizabeth buying a bike at the bike shop from 10am to 3pm.\nThe swimming pool was closed after 4pm.\nBetween what times could Elizabeth have gone to the swimming pool?\nOptions:\n(A) 8am to 10am\n(B) 5am to 8am\n(C) 3pm to 4pm\n(D) 10am to 3pm", "source": "lighteval/big_bench_hard"}
{"gold_label": "Reasoning", "pred": "Math", "instruction": "Today, Andrew went to the coffee shop. Between what times could they have gone?\nWe know that:\nAndrew woke up at 8am.\nElizabeth saw Andrew fixing their computer at the electronic store from 11am to 3pm.\nJennifer saw Andrew buying a bike at the bike shop from 3pm to 4pm.\nAnthony saw Andrew buying cookies at a bakery from 4pm to 6pm.\nBetty saw Andrew watching a movie at the theater from 6pm to 8pm.\nThe coffee shop was closed after 8pm.\nBetween what times could Andrew have gone to the coffee shop?\nOptions:\n(A) 3pm to 4pm\n(B) 6pm to 8pm\n(C) 8am to 11am\n(D) 11am to 3pm", "source": "lighteval/big_bench_hard"}
{"gold_label": "Reasoning", "pred": "Extraction", "instruction": "Today, Andrew went to the museum. Between what times could they have gone?\nWe know that:\nAndrew woke up at 5am.\nJames saw Andrew walking in the garden from 5am to 6am.\nAshley saw Andrew getting a coffee at the cafe from 6am to 8am.\nDavid saw Andrew sitting on a rooftop from 8am to 10am.\nBetty saw Andrew buying lunch at the deli from 10am to 11am.\nThomas saw Andrew fixing their computer at the electronic store from 8pm to 10pm.\nThe museum was closed after 10pm.\nBetween what times could Andrew have gone to the museum?\nOptions:\n(A) 8am to 10am\n(B) 5am to 6am\n(C) 6am to 8am\n(D) 11am to 8pm", "source": "lighteval/big_bench_hard"}
{"gold_label": "Reasoning", "pred": "Extraction", "instruction": "Today, Ashley went to the basketball court. Between what times could they have gone?\nWe know that:\nAshley woke up at 7am.\nLisa saw Ashley working at the office from 7am to 8am.\nMichael saw Ashley walking in the garden from 8am to 10am.\nSamantha saw Ashley attending class at the school from 1pm to 6pm.\nThe basketball court was closed after 6pm.\nBetween what times could Ashley have gone to the basketball court?\nOptions:\n(A) 8am to 10am\n(B) 1pm to 6pm\n(C) 10am to 1pm\n(D) 7am to 8am", "source": "lighteval/big_bench_hard"}
{"gold_label": "Reasoning", "pred": "Extraction", "instruction": "Today, Sarah went to the soccer field. Between what times could they have gone?\nWe know that:\nSarah woke up at 5am.\nLinda saw Sarah taking photos near the Leaning Tower of Pisa from 5am to 6am.\nKimberly saw Sarah playing tennis at the tennis court from 8am to 10am.\nJennifer saw Sarah buying cookies at a bakery from 10am to 7pm.\nMichael saw Sarah working at the office from 7pm to 9pm.\nThe soccer field was closed after 9pm.\nBetween what times could Sarah have gone to the soccer field?\nOptions:\n(A) 5am to 6am\n(B) 6am to 8am\n(C) 8am to 10am\n(D) 7pm to 9pm", "source": "lighteval/big_bench_hard"}
{"gold_label": "Reasoning", "pred": "Extraction", "instruction": "Today, Betty went to the coffee shop. Between what times could they have gone?\nWe know that:\nBetty woke up at 8am.\nEmily saw Betty buying a bike at the bike shop from 8am to 10am.\nSusan saw Betty walking in the garden from 10am to 12pm.\nJohn saw Betty working at the office from 1pm to 3pm.\nLisa saw Betty buying cookies at a bakery from 3pm to 4pm.\nMark saw Betty taking photos near the Eiffel Tower from 4pm to 5pm.\nThe coffee shop was closed after 5pm.\nBetween what times could Betty have gone to the coffee shop?\nOptions:\n(A) 1pm to 3pm\n(B) 8am to 10am\n(C) 10am to 12pm\n(D) 12pm to 1pm", "source": "lighteval/big_bench_hard"}
{"gold_label": "Reasoning", "pred": "Math", "instruction": "Today, Sarah went to the art show. Between what times could they have gone?\nWe know that:\nSarah woke up at 11am.\nLisa saw Sarah working at the office from 11am to 2pm.\nHannah saw Sarah reading at the library from 2pm to 4pm.\nElizabeth saw Sarah fixing their computer at the electronic store from 4pm to 7pm.\nThe art show was closed after 9pm.\nBetween what times could Sarah have gone to the art show?\nOptions:\n(A) 11am to 2pm\n(B) 7pm to 9pm\n(C) 2pm to 4pm\n(D) 4pm to 7pm", "source": "lighteval/big_bench_hard"}
{"gold_label": "Reasoning", "pred": "Extraction", "instruction": "The following paragraphs each describe a set of five objects arranged in a fixed order. The statements are logically consistent within each paragraph. In an antique car show, there are five vehicles: a hatchback, a bus, a convertible, a tractor, and a minivan. The tractor is older than the bus. The minivan is newer than the bus. The hatchback is the second-newest. The minivan is older than the convertible.\nOptions:\n(A) The hatchback is the oldest\n(B) The bus is the oldest\n(C) The convertible is the oldest\n(D) The tractor is the oldest\n(E) The minivan is the oldest", "source": "lighteval/big_bench_hard"}
{"gold_label": "Reasoning", "pred": "Extraction", "instruction": "The following paragraphs each describe a set of five objects arranged in a fixed order. The statements are logically consistent within each paragraph. On a branch, there are five birds: a crow, a robin, a quail, a blue jay, and a falcon. The robin is to the left of the quail. The falcon is the third from the left. The crow is to the left of the falcon. The blue jay is the leftmost.\nOptions:\n(A) The crow is the second from the left\n(B) The robin is the second from the left\n(C) The quail is the second from the left\n(D) The blue jay is the second from the left\n(E) The falcon is the second from the left", "source": "lighteval/big_bench_hard"}
{"gold_label": "Reasoning", "pred": "Extraction", "instruction": "The following paragraphs each describe a set of five objects arranged in a fixed order. The statements are logically consistent within each paragraph. On a branch, there are five birds: an owl, a crow, a raven, a robin, and a cardinal. The raven is to the right of the owl. The raven is the second from the left. The robin is to the left of the crow. The robin is the second from the right.\nOptions:\n(A) The owl is the leftmost\n(B) The crow is the leftmost\n(C) The raven is the leftmost\n(D) The robin is the leftmost\n(E) The cardinal is the leftmost", "source": "lighteval/big_bench_hard"}
{"gold_label": "Reasoning", "pred": "Extraction", "instruction": "The following paragraphs each describe a set of five objects arranged in a fixed order. The statements are logically consistent within each paragraph. In a golf tournament, there were five golfers: Dan, Mel, Ana, Eli, and Rob. Rob finished above Mel. Mel finished second. Ana finished above Eli. Dan finished last.\nOptions:\n(A) Dan finished first\n(B) Mel finished first\n(C) Ana finished first\n(D) Eli finished first\n(E) Rob finished first", "source": "lighteval/big_bench_hard"}
{"gold_label": "Reasoning", "pred": "Extraction", "instruction": "The following paragraphs each describe a set of five objects arranged in a fixed order. The statements are logically consistent within each paragraph. In a golf tournament, there were five golfers: Dan, Ana, Eli, Mya, and Amy. Amy finished above Eli. Ana finished above Amy. Mya finished below Eli. Dan finished last.\nOptions:\n(A) Dan finished third\n(B) Ana finished third\n(C) Eli finished third\n(D) Mya finished third\n(E) Amy finished third", "source": "lighteval/big_bench_hard"}
{"gold_label": "Reasoning", "pred": "Extraction", "instruction": "The following paragraphs each describe a set of five objects arranged in a fixed order. The statements are logically consistent within each paragraph. A fruit stand sells five fruits: watermelons, apples, mangoes, pears, and cantaloupes. The watermelons are more expensive than the cantaloupes. The mangoes are less expensive than the pears. The apples are the second-cheapest. The watermelons are less expensive than the mangoes.\nOptions:\n(A) The watermelons are the third-most expensive\n(B) The apples are the third-most expensive\n(C) The mangoes are the third-most expensive\n(D) The pears are the third-most expensive\n(E) The cantaloupes are the third-most expensive", "source": "lighteval/big_bench_hard"}
{"gold_label": "Reasoning", "pred": "Extraction", "instruction": "The following paragraphs each describe a set of five objects arranged in a fixed order. The statements are logically consistent within each paragraph. A fruit stand sells five fruits: kiwis, pears, peaches, loquats, and apples. The loquats are more expensive than the kiwis. The pears are more expensive than the peaches. The apples are more expensive than the loquats. The apples are less expensive than the peaches.\nOptions:\n(A) The kiwis are the third-most expensive\n(B) The pears are the third-most expensive\n(C) The peaches are the third-most expensive\n(D) The loquats are the third-most expensive\n(E) The apples are the third-most expensive", "source": "lighteval/big_bench_hard"}
{"gold_label": "Reasoning", "pred": "Extraction", "instruction": "The following paragraphs each describe a set of five objects arranged in a fixed order. The statements are logically consistent within each paragraph. On a shelf, there are five books: a white book, a gray book, a purple book, a yellow book, and a black book. The yellow book is to the left of the white book. The black book is to the left of the yellow book. The purple book is the rightmost. The white book is to the left of the gray book.\nOptions:\n(A) The white book is the second from the left\n(B) The gray book is the second from the left\n(C) The purple book is the second from the left\n(D) The yellow book is the second from the left\n(E) The black book is the second from the left", "source": "lighteval/big_bench_hard"}
{"gold_label": "Reasoning", "pred": "Extraction", "instruction": "The following paragraphs each describe a set of five objects arranged in a fixed order. The statements are logically consistent within each paragraph. A fruit stand sells five fruits: apples, loquats, cantaloupes, watermelons, and kiwis. The watermelons are more expensive than the loquats. The watermelons are less expensive than the kiwis. The loquats are more expensive than the apples. The cantaloupes are the cheapest.\nOptions:\n(A) The apples are the third-most expensive\n(B) The loquats are the third-most expensive\n(C) The cantaloupes are the third-most expensive\n(D) The watermelons are the third-most expensive\n(E) The kiwis are the third-most expensive", "source": "lighteval/big_bench_hard"}
{"gold_label": "Reasoning", "pred": "Extraction", "instruction": "The following paragraphs each describe a set of five objects arranged in a fixed order. The statements are logically consistent within each paragraph. A fruit stand sells five fruits: loquats, peaches, pears, plums, and watermelons. The watermelons are the most expensive. The peaches are more expensive than the loquats. The plums are the second-cheapest. The pears are the third-most expensive.\nOptions:\n(A) The loquats are the cheapest\n(B) The peaches are the cheapest\n(C) The pears are the cheapest\n(D) The plums are the cheapest\n(E) The watermelons are the cheapest", "source": "lighteval/big_bench_hard"}
{"gold_label": "Reasoning", "pred": "Extraction", "instruction": "The following paragraphs each describe a set of five objects arranged in a fixed order. The statements are logically consistent within each paragraph. In a golf tournament, there were five golfers: Eve, Rob, Ana, Eli, and Mya. Eli finished below Mya. Eve finished first. Ana finished below Eli. Rob finished second.\nOptions:\n(A) Eve finished first\n(B) Rob finished first\n(C) Ana finished first\n(D) Eli finished first\n(E) Mya finished first", "source": "lighteval/big_bench_hard"}
{"gold_label": "Reasoning", "pred": "Extraction", "instruction": "The following paragraphs each describe a set of five objects arranged in a fixed order. The statements are logically consistent within each paragraph. A fruit stand sells five fruits: oranges, apples, peaches, cantaloupes, and loquats. The loquats are less expensive than the cantaloupes. The cantaloupes are less expensive than the apples. The oranges are the most expensive. The apples are the third-most expensive.\nOptions:\n(A) The oranges are the second-most expensive\n(B) The apples are the second-most expensive\n(C) The peaches are the second-most expensive\n(D) The cantaloupes are the second-most expensive\n(E) The loquats are the second-most expensive", "source": "lighteval/big_bench_hard"}
{"gold_label": "Reasoning", "pred": "Extraction", "instruction": "The following paragraphs each describe a set of five objects arranged in a fixed order. The statements are logically consistent within each paragraph. In an antique car show, there are five vehicles: a sedan, a convertible, a station wagon, a bus, and a motorcyle. The sedan is the second-newest. The station wagon is newer than the convertible. The sedan is older than the motorcyle. The bus is the oldest.\nOptions:\n(A) The sedan is the oldest\n(B) The convertible is the oldest\n(C) The station wagon is the oldest\n(D) The bus is the oldest\n(E) The motorcyle is the oldest", "source": "lighteval/big_bench_hard"}
{"gold_label": "Reasoning", "pred": "Extraction", "instruction": "The following paragraphs each describe a set of five objects arranged in a fixed order. The statements are logically consistent within each paragraph. In an antique car show, there are five vehicles: a truck, a station wagon, a motorcyle, a convertible, and a hatchback. The convertible is newer than the truck. The station wagon is newer than the hatchback. The convertible is older than the hatchback. The station wagon is the second-newest.\nOptions:\n(A) The truck is the oldest\n(B) The station wagon is the oldest\n(C) The motorcyle is the oldest\n(D) The convertible is the oldest\n(E) The hatchback is the oldest", "source": "lighteval/big_bench_hard"}
{"gold_label": "Reasoning", "pred": "Extraction", "instruction": "The following paragraphs each describe a set of five objects arranged in a fixed order. The statements are logically consistent within each paragraph. In an antique car show, there are five vehicles: a truck, a station wagon, a motorcyle, a convertible, and a hatchback. The convertible is newer than the truck. The station wagon is newer than the hatchback. The convertible is older than the hatchback. The station wagon is the second-newest.\nOptions:\n(A) The truck is the second-oldest\n(B) The station wagon is the second-oldest\n(C) The motorcyle is the second-oldest\n(D) The convertible is the second-oldest\n(E) The hatchback is the second-oldest", "source": "lighteval/big_bench_hard"}
{"gold_label": "Reasoning", "pred": "Extraction", "instruction": "The following paragraphs each describe a set of five objects arranged in a fixed order. The statements are logically consistent within each paragraph. A fruit stand sells five fruits: mangoes, pears, cantaloupes, watermelons, and loquats. The mangoes are more expensive than the loquats. The cantaloupes are more expensive than the mangoes. The watermelons are more expensive than the cantaloupes. The pears are the cheapest.\nOptions:\n(A) The mangoes are the second-cheapest\n(B) The pears are the second-cheapest\n(C) The cantaloupes are the second-cheapest\n(D) The watermelons are the second-cheapest\n(E) The loquats are the second-cheapest", "source": "lighteval/big_bench_hard"}
{"gold_label": "Reasoning", "pred": "Extraction", "instruction": "The following paragraphs each describe a set of five objects arranged in a fixed order. The statements are logically consistent within each paragraph. In an antique car show, there are five vehicles: a hatchback, a convertible, a tractor, a truck, and a limousine. The tractor is newer than the hatchback. The limousine is the second-newest. The truck is older than the hatchback. The convertible is the second-oldest.\nOptions:\n(A) The hatchback is the second-oldest\n(B) The convertible is the second-oldest\n(C) The tractor is the second-oldest\n(D) The truck is the second-oldest\n(E) The limousine is the second-oldest", "source": "lighteval/big_bench_hard"}
{"gold_label": "Reasoning", "pred": "Extraction", "instruction": "The following paragraphs each describe a set of five objects arranged in a fixed order. The statements are logically consistent within each paragraph. On a branch, there are five birds: an owl, a hummingbird, a crow, a hawk, and a robin. The hawk is to the right of the owl. The hummingbird is to the right of the hawk. The crow is the second from the right. The robin is the second from the left.\nOptions:\n(A) The owl is the second from the left\n(B) The hummingbird is the second from the left\n(C) The crow is the second from the left\n(D) The hawk is the second from the left\n(E) The robin is the second from the left", "source": "lighteval/big_bench_hard"}
{"gold_label": "Reasoning", "pred": "Extraction", "instruction": "The following paragraphs each describe a set of five objects arranged in a fixed order. The statements are logically consistent within each paragraph. On a shelf, there are five books: a blue book, a purple book, a yellow book, a black book, and a green book. The purple book is the third from the left. The yellow book is to the left of the black book. The green book is to the left of the purple book. The blue book is to the left of the green book.\nOptions:\n(A) The blue book is the rightmost\n(B) The purple book is the rightmost\n(C) The yellow book is the rightmost\n(D) The black book is the rightmost\n(E) The green book is the rightmost", "source": "lighteval/big_bench_hard"}
{"gold_label": "Reasoning", "pred": "Extraction", "instruction": "The following paragraphs each describe a set of five objects arranged in a fixed order. The statements are logically consistent within each paragraph. In a golf tournament, there were five golfers: Ana, Rob, Amy, Dan, and Joe. Rob finished below Dan. Joe finished below Ana. Joe finished above Amy. Dan finished below Amy.\nOptions:\n(A) Ana finished last\n(B) Rob finished last\n(C) Amy finished last\n(D) Dan finished last\n(E) Joe finished last", "source": "lighteval/big_bench_hard"}
{"gold_label": "Reasoning", "pred": "Extraction", "instruction": "The following paragraphs each describe a set of five objects arranged in a fixed order. The statements are logically consistent within each paragraph. A fruit stand sells five fruits: peaches, watermelons, pears, apples, and cantaloupes. The watermelons are the second-cheapest. The apples are the cheapest. The peaches are the most expensive. The pears are less expensive than the cantaloupes.\nOptions:\n(A) The peaches are the second-cheapest\n(B) The watermelons are the second-cheapest\n(C) The pears are the second-cheapest\n(D) The apples are the second-cheapest\n(E) The cantaloupes are the second-cheapest", "source": "lighteval/big_bench_hard"}
{"gold_label": "Reasoning", "pred": "Extraction", "instruction": "The following paragraphs each describe a set of five objects arranged in a fixed order. The statements are logically consistent within each paragraph. A fruit stand sells five fruits: apples, mangoes, oranges, plums, and pears. The oranges are more expensive than the mangoes. The plums are the cheapest. The apples are less expensive than the mangoes. The pears are more expensive than the oranges.\nOptions:\n(A) The apples are the third-most expensive\n(B) The mangoes are the third-most expensive\n(C) The oranges are the third-most expensive\n(D) The plums are the third-most expensive\n(E) The pears are the third-most expensive", "source": "lighteval/big_bench_hard"}
{"gold_label": "Reasoning", "pred": "Extraction", "instruction": "The following paragraphs each describe a set of five objects arranged in a fixed order. The statements are logically consistent within each paragraph. On a branch, there are five birds: a quail, an owl, a raven, a falcon, and a robin. The owl is the leftmost. The robin is to the left of the raven. The quail is the rightmost. The raven is the third from the left.\nOptions:\n(A) The quail is the leftmost\n(B) The owl is the leftmost\n(C) The raven is the leftmost\n(D) The falcon is the leftmost\n(E) The robin is the leftmost", "source": "lighteval/big_bench_hard"}
{"gold_label": "Reasoning", "pred": "Extraction", "instruction": "The following paragraphs each describe a set of five objects arranged in a fixed order. The statements are logically consistent within each paragraph. On a shelf, there are five books: a black book, a purple book, a yellow book, an orange book, and a red book. The yellow book is the rightmost. The black book is to the left of the orange book. The orange book is to the left of the purple book. The black book is the second from the left.\nOptions:\n(A) The black book is the second from the left\n(B) The purple book is the second from the left\n(C) The yellow book is the second from the left\n(D) The orange book is the second from the left\n(E) The red book is the second from the left", "source": "lighteval/big_bench_hard"}
{"gold_label": "Reasoning", "pred": "Extraction", "instruction": "The following paragraphs each describe a set of five objects arranged in a fixed order. The statements are logically consistent within each paragraph. On a branch, there are five birds: an owl, a crow, a raven, a robin, and a cardinal. The raven is to the right of the owl. The raven is the second from the left. The robin is to the left of the crow. The robin is the second from the right.\nOptions:\n(A) The owl is the third from the left\n(B) The crow is the third from the left\n(C) The raven is the third from the left\n(D) The robin is the third from the left\n(E) The cardinal is the third from the left", "source": "lighteval/big_bench_hard"}
{"gold_label": "Reasoning", "pred": "Coding", "instruction": "Sam is doing a project.\nA project is written either in C++ or Python.\nIf Sam does a project written in Python, he will not use a Mac.\nSam is using a Mac.\nIf Sam uses a Mac, he will play a song.\nIf a song is not titled \"Perfect,\" Sam will never play it.\n\nIf a song is titled \"Perfect\", Sam will play it. What do you think about that statement?", "source": "tasksource/folio"}
{"gold_label": "Reasoning", "pred": "Brainstorming", "instruction": "Beasts of Prey is a fantasy novel or a science fiction novel, or both.\nScience fiction novels are not about mythological creatures\nBeasts of Prey Is about a creature known as the Shetani.\nShetanis are mythological.\n\nA shetani is either mythological or a creature. True or False?", "source": "tasksource/folio"}
{"gold_label": "Factual QA", "pred": "Brainstorming", "instruction": "What does boredom mean and what use does it have?", "source": "HuggingFaceH4/no_robots"}
{"gold_label": "Factual QA", "pred": "Brainstorming", "instruction": "Should you purchase dental insurance?", "source": "HuggingFaceH4/no_robots"}
{"gold_label": "Factual QA", "pred": "Brainstorming", "instruction": "What breed of dog is best for avoiding allergy problems?", "source": "HuggingFaceH4/no_robots"}
{"gold_label": "Factual QA", "pred": "Coding", "instruction": "What Google Sheets formula assigns a name to the value_expression results and returns the result of the formula_expression?", "source": "basicv8vc/SimpleQA"}
{"gold_label": "Brainstorming", "pred": "Generation", "instruction": "I'm making a text adventure. My character is blocked by a chasm. How might they get across? Explain why they might not be successful.", "source": "HuggingFaceH4/no_robots"}
{"gold_label": "Brainstorming", "pred": "Generation", "instruction": "I'm writing a report on the importance of bees, can you give me a few reasons why bees are so important and why we need to protect them?", "source": "HuggingFaceH4/no_robots"}
{"gold_label": "Brainstorming", "pred": "Extraction", "instruction": "which of these songs are titled in english and which are titled in italian?\n\nNo.\tTitle\tLength\n1.\t\"Zitti e buoni\"\t3:14\n2.\t\"Lividi sui gomiti\"\t2:45\n3.\t\"I Wanna Be Your Slave\"\t2:53\n4.\t\"In nome del padre\"\t3:39\n5.\t\"For Your Love\"\t3:50\n6.\t\"La paura del buio\"\t3:29\n7.\t\"Vent'anni\"\t4:13\nTotal length:", "source": "HuggingFaceH4/no_robots"}
{"gold_label": "Brainstorming", "pred": "Factual QA", "instruction": "Let me know whether the plants listed below are annuals or perennials.\n- petunias\n- geraniums\n- black-eyed susans\n- foxglove\n- common milkweed\n- impatiens \n- daylilies", "source": "HuggingFaceH4/no_robots"}
{"gold_label": "Brainstorming", "pred": "Factual QA", "instruction": "What are the types of clouds and their characteristics?", "source": "Wanfq/Explore_Instruct_Brainstorming_10k"}
{"gold_label": "Brainstorming", "pred": "Factual QA", "instruction": "Explain the process of photosynthesis.", "source": "Wanfq/Explore_Instruct_Brainstorming_10k"}
{"gold_label": "Brainstorming", "pred": "Coding", "instruction": "What are some common programming languages used in web development?", "source": "Wanfq/Explore_Instruct_Brainstorming_10k"}
{"gold_label": "Brainstorming", "pred": "Factual QA", "instruction": "What are the differences between RNA and DNA?", "source": "Wanfq/Explore_Instruct_Brainstorming_10k"}
{"gold_label": "Brainstorming", "pred": "Generation", "instruction": "Imagine your country has an energy crisis, brainstorm ways to reduce energy consumption.", "source": "Wanfq/Explore_Instruct_Brainstorming_10k"}
{"gold_label": "Brainstorming", "pred": "Generation", "instruction": "Come up with 10 unique toppings for loaded french fries.", "source": "Wanfq/Explore_Instruct_Brainstorming_10k"}
{"gold_label": "Brainstorming", "pred": "Generation", "instruction": "Please describe the impact on the environment caused by deforestation.", "source": "Wanfq/Explore_Instruct_Brainstorming_10k"}
{"gold_label": "Brainstorming", "pred": "Generation", "instruction": "Come up with 10 unique Halloween costumes.", "source": "Wanfq/Explore_Instruct_Brainstorming_10k"}
{"gold_label": "Brainstorming", "pred": "Coding", "instruction": "Please list different programming languages used for web development.", "source": "Wanfq/Explore_Instruct_Brainstorming_10k"}
{"gold_label": "Brainstorming", "pred": "Coding", "instruction": "Recommend the best book for learning Python programming.", "source": "Wanfq/Explore_Instruct_Brainstorming_10k"}
{"gold_label": "Brainstorming", "pred": "Coding", "instruction": "Recommend some books for learning how to program in Python.", "source": "Wanfq/Explore_Instruct_Brainstorming_10k"}
{"gold_label": "Brainstorming", "pred": "Generation", "instruction": "Imagine you are stranded on a deserted island, list 10 items that you would need to survive.", "source": "Wanfq/Explore_Instruct_Brainstorming_10k"}
{"gold_label": "Brainstorming", "pred": "Factual QA", "instruction": "What are the most commonly used programming languages?", "source": "Wanfq/Explore_Instruct_Brainstorming_10k"}
{"gold_label": "Brainstorming", "pred": "Generation", "instruction": "Can you give me some tips for writing an effective cover letter?", "source": "Wanfq/Explore_Instruct_Brainstorming_10k"}
{"gold_label": "Brainstorming", "pred": "Factual QA", "instruction": "How do vaccines work to prevent diseases?", "source": "Wanfq/Explore_Instruct_Brainstorming_10k"}
{"gold_label": "Brainstorming", "pred": "Factual QA", "instruction": "List the five major religions in the world.", "source": "Wanfq/Explore_Instruct_Brainstorming_10k"}
{"gold_label": "Brainstorming", "pred": "Generation", "instruction": "Please brainstorm possible marketing strategies for a new product launch using the keyword \"digital marketing\".", "source": "Wanfq/Explore_Instruct_Brainstorming_10k"}
{"gold_label": "Brainstorming", "pred": "Generation", "instruction": "Can you suggest some creative menu options for a vegan restaurant?", "source": "Wanfq/Explore_Instruct_Brainstorming_10k"}
{"gold_label": "Brainstorming", "pred": "Factual QA", "instruction": "Provide some unique features of African elephants.", "source": "Wanfq/Explore_Instruct_Brainstorming_10k"}
{"gold_label": "Brainstorming", "pred": "Factual QA", "instruction": "What are some of the most important events in World War II?", "source": "Wanfq/Explore_Instruct_Brainstorming_10k"}
{"gold_label": "Brainstorming", "pred": "Factual QA", "instruction": "Please list the steps of the scientific method.", "source": "Wanfq/Explore_Instruct_Brainstorming_10k"}
{"gold_label": "Brainstorming", "pred": "Factual QA", "instruction": "Name five countries that are located in the African continent.", "source": "Wanfq/Explore_Instruct_Brainstorming_10k"}
{"gold_label": "Brainstorming", "pred": "Factual QA", "instruction": "Can you list some popular actors of Hollywood?", "source": "Wanfq/Explore_Instruct_Brainstorming_10k"}
{"gold_label": "Brainstorming", "pred": "Factual QA", "instruction": "Can you list some famous inventors and their inventions?", "source": "Wanfq/Explore_Instruct_Brainstorming_10k"}
{"gold_label": "Brainstorming", "pred": "Factual QA", "instruction": "Can you provide a list of all the countries in Asia in alphabetical order?", "source": "Wanfq/Explore_Instruct_Brainstorming_10k"}
{"gold_label": "Brainstorming", "pred": "Generation", "instruction": "brainstorm unique brand names for a startup that sells handmade jewelry.", "source": "Wanfq/Explore_Instruct_Brainstorming_10k"}
{"gold_label": "Brainstorming", "pred": "Coding", "instruction": "List all the programming languages you know.", "source": "Wanfq/Explore_Instruct_Brainstorming_10k"}
{"gold_label": "Brainstorming", "pred": "Generation", "instruction": "Come up with 10 potential names for a new coffee shop.", "source": "Wanfq/Explore_Instruct_Brainstorming_10k"}
{"gold_label": "Brainstorming", "pred": "Generation", "instruction": "Can you give some tips on how to create a resume?", "source": "Wanfq/Explore_Instruct_Brainstorming_10k"}
{"gold_label": "Brainstorming", "pred": "Generation", "instruction": "Come up with 5 marketing strategies for a vegan restaurant using the keyword \"plant-based\".", "source": "Wanfq/Explore_Instruct_Brainstorming_10k"}